Lecture Notes in 
Computer Science 



1926 



Mathai Joseph (Ed.) 



Formal Techniques 
in Real-Time and 
Fault-Tolerant Systems 

6th International Symposium, FTRTFT 2000 
Pune, India, September 2000 
Proceedings 





Springer 






Lecture Notes in Computer Science 1926

Edited by G. Goos, J. Hartmanis and J. van Leeuwen 




Springer 

Berlin 

Heidelberg 

New York 

Barcelona 

Hong Kong 

London 

Milan 

Paris 

Singapore 

Tokyo 




6th International Symposium, FTRTFT 2000 
Pune, India, September 20-22, 2000 
Proceedings 




Springer 




Series Editors 



Gerhard Goos, Karlsruhe University, Germany 
Juris Hartmanis, Cornell University, NY, USA 
Jan van Leeuwen, Utrecht University, The Netherlands 

Volume Editor 
Mathai Joseph 

Tata Research Development and Design Centre 
54B, Hadapsar Industrial Estate, Pune 411013, India 
E-mail: mathai@pune.tcs.co.in



Cataloging-in-Publication Data applied for

Die Deutsche Bibliothek - CIP-Einheitsaufnahme

Formal techniques in real time and fault tolerant systems : 6th
international symposium ; proceedings / FTRTFT 2000, Pune, India,
September 20 - 22, 2000. Mathai Joseph (ed.). - Berlin ; Heidelberg ;
New York ; Barcelona ; Hong Kong ; London ; Milan ; Paris ; Singapore ; Tokyo :
Springer, 2000

(Lecture Notes in Computer Science ; 1926)

ISBN 3-540-41055-4 



CR Subject Classification (1998): D.3.1, F.3.1, C.1.m, C.3, B.3.4, B.1.3
ISSN 0302-9743 

ISBN 3-540-41055-4 Springer-Verlag Berlin Heidelberg New York



This work is subject to copyright. All rights are reserved, whether the whole or part of the material is 
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, 
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication 
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, 
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
liable for prosecution under the German Copyright Law. 

Springer-Verlag Berlin Heidelberg New York

a member of BertelsmannSpringer Science+Business Media GmbH 
© Springer-Verlag Berlin Heidelberg 2000 
Printed in Germany 

Typesetting: Camera-ready by author 

Printed on acid-free paper SPIN 10722874 06/3142 5 4 3 2 1 0 




Preface 



The six Schools and Symposia on Formal Techniques in Real Time and Fault
Tolerant Systems (FTRTFT) have seen the field develop from tentative
explorations to a far higher degree of maturity, and from being under the
scrutiny of a few interested software designers and academics to becoming a
well-established area of inquiry. A number of new topics, such as hybrid
systems, have been germinated at these meetings and cross-links explored with
related subjects such as scheduling theory. There has certainly been progress
during these 12 years, but it is sobering to see how far and how fast practice
has moved ahead in the same time, and how much more work remains to be done
before the design of a mission-critical system can be based entirely on sound
engineering principles underpinned by solid scientific theory.

The Sixth School and Symposium were organized by the Tata Research
Development and Design Centre in Pune, India. The lectures at the School were
given by Ian Hayes (U. of Queensland), Paritosh Pandya (Tata Institute of
Fundamental Research), Willem-Paul de Roever (Christian Albrechts U.) and
Joseph Sifakis (VERIMAG). There were three invited lectures at the Symposium,
by Werner Damm (U. of Oldenburg), Nicolas Halbwachs (VERIMAG) and Yoram
Moses (Technion).

A sizable number of submissions were received for the Symposium from
authors representing 16 different countries. The papers were reviewed by the
Programme Committee, who along with other specialists made up a panel of 50
reviewers. After electronic discussion by the Programme Committee, 21 papers
were selected for presentation.

The School and Symposium were organized by a committee consisting of 
Aditya Nori, Purandar Bhaduri and R. Venkatesh. They were assisted in no 
small measure by Sandeep Bodas, Kalyanmoy Dihingia, Adi Irani, Dinaz Irani, 
Shirish Lele, Nitin Purandare, Jyotsna Ravishankar and Parag S. Vazare, who 
all deserve particular thanks for all their help. 

The School and Symposium were supported most generously by Tata
Consultancy Services and our thanks go to S. Ramadorai, the Chief Executive Officer.

Finally, having survived the anxieties of organizing the first FTRTFT
meeting in Warwick in 1988, it has given me great pleasure to participate in the
organization of FTRTFT 2000 in Pune, which marks the first time that the
meeting has been held outside Europe.



September 2000 



Mathai Joseph 




Program Committee 



R. Alur (Univ. of Pennsylvania) 
A. Arora (Univ. of Ohio) 

H. Hansson (Mälardalen Univ.)

I. Hayes (Univ. of Queensland) 

L. Huimin (IOS, Beijing)

H. Jifeng (IIST Macau) 

M. Joseph (chair) (TRDDC) 

Z. Liu (Univ. of Leicester) 

A. Mok (Univ. of Texas) 

K.V. Nori (TRDDC) 

P. Pandya (TIFR) 

A. Pnueli (Weizmann Inst) 

K. Ramamritham (IIT Mumbai) 

S. Ramesh (IIT Mumbai) 

A. Ravn (Aalborg Univ.) 

H. Rischel (TU Denmark) 

W.-P. de Roever (CAU Kiel) 

N. Shankar (SRI) 

J. Vytopil (KU Nijmegen) 

S. Yovine (VERIMAG Grenoble) 



Steering Committee 



M. Joseph (TRDDC) 

A. Pnueli (Weizmann Inst) 
W.-P. de Roever (CAU Kiel) 
J. Vytopil (KU Nijmegen) 



Organizing Committee 



P. Bhaduri 
A. V. Nori 
R. Venkatesh 







Referees 



Tamarah Arons 
Parosh Aziz Abdulla 
Rana Barua 
Purandar Bhaduri 
Ahmed Bouajjani 
Alan Burns 
A. Cerone 

Supratik Chakrabarty 
Jing Chen 
Bruno Dutertre 
Kai Engelhardt

Colin Fidge 
Felix C. Gartner 
Dimitar Guelev 
Nicolas Halbwachs 
Anna Ingolfsdottir 
Henrik Ejersbo Jensen 



Josva Kleist 
Kaare Kristoffersen 
Vinay Kulkarni 
Guangyun Li 
Xiaoshan Li 
Xuandong Li 
Kamal Lodaya 
Gavin Lowe 
Richard Moore 
Madhavan Mukund 
Kedar Namjoshi 
K. Narayan Kumar 

R. Narayanan 

S. Parthasarathy 
Sasi Punnekkat 
Zongyan Qiu 
Xu Qiwen 



John Rushby 
Partha S. Roop 
Manoranjan Satpathy 
Steve Schneider 
R.K. Shyamasundar 
G. Sivakumar 
Graeme Smith 
A. Sowmya 
Ashok Sreenivas 
Henrik Thane 
Dang Van Hung 
R. Venkatesh 
Thomas Wilke 
Wang Yi 
Naijun Zhan 



Sponsor 

Tata Consultancy Services and 

Tata Research Development and Design Centre 

54-B, Hadapsar Industrial Estate 

Pune-411013 

INDIA 




Table of Contents 



Invited Lectures 

Stability of Discrete Sampled Systems 1 

N. Halbwachs, J.-F. Héry, J.-C. Laleuf, X. Nicollin

Issues in the Refinement of Distributed Programs 12 

Yoram Moses 

Challenges in the Verification of Electronic Control Units 18 

Werner Damm 

Model Checking 

Scaling up Uppaal: Automatic Verification of Real-Time Systems Using
Compositionality and Abstraction 19

Henrik Ejersbo Jensen, Kim Guldstrand Larsen, Arne Skou 

Decidable Model Checking of Probabilistic Hybrid Automata 31 

Jeremy Sproston 

Fault Tolerance 

Invariant-Based Synthesis of Fault-Tolerant Systems 46 

K. Lano, David Clark, K. Androutsopoulos, P. Kan 

Modeling Faults of Distributed, Reactive Systems 58 

Max Breitling 

Threshold and Bounded-Delay Voting in Critical Control Systems 70 

Paul Caspi, Rym Salem 

Automating the Addition of Fault-Tolerance 82 

Sandeep S. Kulkarni, Anish Arora 

Reliability Modelling of Time-Critical Distributed Systems 94 

Hans Hansson, Christer Norstrom, Sasikumar Punnekkat 

Scheduling 

A Methodology for the Construction of Scheduled Systems 106 

K. Altisen, G. Gößler, J. Sifakis

A Dual Interpretation of “Standard Constraints” in Parametric Scheduling 121 
K. Subramani, Ashok Agrawala







Validation 

Co-simulation of Hybrid Systems: Signal-Simulink 134 

Stephane Tudoret, Simin Nadjm-Tehrani, Albert Benveniste, 

Jan-Erik Stromberg 

A System for Object Code Validation 152 

A. K. Bhattacharjee, Gopa Sen, S. D. Dhodapkar, K. Karunakar, 

Basant Rajan, R. K. Shyamasundar 

Refinement 

Real-Time Program Refinement Using Auxiliary Variables 170 

Ian Hayes 

On Refinement and Temporal Annotations 185 

Ron van der Meyden, Yoram Moses 

Generalizing Action Systems to Hybrid Systems 202 

Ralph-Johan Back, Luigia Petre, Ivan Porres 

Verification 

Compositional Verification of Synchronous Networks 214 

Leszek Holenderski 

Modelling Coordinated Atomic Actions in Timed CSP 228 

Simeon Veloudis, Nimal Nissanke 

Logic and Automata 

A Logical Characterisation of Event Recording Automata 240 

Deepak D’Souza 

Using Cylindrical Algebraic Decomposition for the Analysis of Slope 

Parametric Hybrid Automata 252 

Michael Adlaide and Olivier Roux 

Probabilistic Neighbourhood Logic 264 

Dimitar P. Guelev 

An On-the-Fly Tableau Construction for a Real-Time Temporal Logic 276
Marc Geilen, Dennis Dams 

Verifying Universal Properties of Parameterized Networks 291 

Kai Baukus, Yassine Lakhnech, Karsten Stahl 

Author Index 305 




Stability of Discrete Sampled Systems 



N. Halbwachs¹, J.-F. Héry², J.-C. Laleuf², and X. Nicollin¹

¹ Verimag, Grenoble - France
{Nicolas.Halbwachs,Xavier.Nicollin}@imag.fr
² EDF/DER, Clamart - France
{Jean-Francois.Hery,Jean-Claude.Laleuf}@der.edfgdf.fr



Abstract. We consider the wide class of real-time systems that periodically
sample their inputs. A desirable property of such systems is that
their outputs should be, in some sense, more precise when the sampling
period gets shorter. An approximation of this property consists in requiring
that, whenever the inputs don’t change, the outputs stabilize after
a finite number of steps. We present a set of heuristics to check this
stability property, in the case of purely Boolean systems. These heuristics
have been tried out on nuclear plant control software, and have
been shown to dramatically reduce the cost of stability analysis.



1 Introduction 

Many real-time embedded systems appearing in industrial control (e.g., plant
supervision, flight control, ...) are periodic sampled systems. The global
behavior of such systems is quite simple: it consists of a periodic loop, sampling
inputs (from sensors or, more generally, from a shared memory), computing the
corresponding outputs, and updating the local memory for the next step. Generally
speaking, this periodic behavior is a discrete approximation of an ideal, analog
behavior, which would instantly compute a continuous result from continuous
inputs, and instantly react to any relevant discrete change of the inputs. With
this intuitive intention in mind, one would naturally expect that the shorter
the period, the more accurate the approximation. If we restrict ourselves
to discrete control systems, shortening the period should only involve a more
precise perception of input events (i.e., detecting transient changes, suitably
ordering events that were previously considered simultaneous, ...) and a faster
reaction to these events. Let us call this intuitive property monotonicity.

Now, in actual discrete systems, it happens that, in some situations, some 
Boolean outputs of the system oscillate permanently while the inputs don’t 
change. This behavior obviously violates our intuitive notion of monotonicity, 
since shortening the period would only speed up the oscillation (see Fig. 1). 

Such a phenomenon is generally considered an error, and appears to be
very difficult to detect statically, since it can happen only in very specific states of
the system. A system where it cannot happen is called stable: in a stable system,
whenever the inputs remain unchanged, the outputs reach stable values after a
finite number of periods. Deciding system stability may require the knowledge
of its whole state graph.

Fig. 1. Output oscillation (traces shown: inputs, output, output with a faster period)

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 1-11, 2000.
© Springer-Verlag Berlin Heidelberg 2000

Notice that stability is also an important concern in other domains, like
in the superstep semantics of statecharts [HN96], or in sequential function
charts [IEC93]. The problem of stability is also closely related to causality in
synchronous languages and to the analysis of combinational loops in circuits
[Mal93,SBT96,HM95,NK99].

The goal of this paper is to propose some heuristic techniques allowing, in
most practical cases,

- either to ensure the system stability without building its state graph;

- or to narrow the problem down to some small parts of the system, corresponding
to small parts of the state graph, which can be efficiently built.

We restrict ourselves to discrete sampled systems: they are formalized in
Section 2 as standard Mealy machines, which can classically be represented as
operator networks (sequential circuits), made of combinational Boolean gates
and Boolean memories. In Section 3, we formally define the notion of stability,
and the slightly stronger property that will actually be analyzed: it requires that
whenever the inputs stay stable, the outputs and the state (i.e., the memories)
stabilize after a finite number of steps. In Section 4, we show that each strongly
connected component (SCC) of the operator network can be analyzed in turn,
and that the stability of some simple SCCs, containing only one memory, can be
shown. In Section 5, we use local necessary conditions of instability, called local
cycle conditions, the infeasibility of which is often easy to show. In Section 6, we
broaden these cycle conditions, using approximate values of the involved stable
variables.



2 Definitions and Notations 

Throughout the paper, we will identify discrete systems with standard Mealy
machines. Let $\mathbb{B} = \{0, 1\}$ be the set of Boolean values, and let us write
disjunction additively, conjunction multiplicatively, and negation with the
"bar" notation ($\bar{f} = \neg f$). A Mealy machine with $n$ state variables, $m$ input
variables and $p$ output variables is a pair $(\tau, \delta)$ where

- $\tau$ is a total function from $\mathbb{B}^{m+n}$ to $\mathbb{B}^n$ (transition function), given as a
vector $[\tau_k]_{k=1 \ldots n}$ of $n$ functions from $\mathbb{B}^{m+n}$ to $\mathbb{B}$. If $s \in \mathbb{B}^n$ (state) and
$\iota \in \mathbb{B}^m$ (input), $\tau(\iota, s)$ denotes the vector $[\tau_k(\iota, s)]_{k=1 \ldots n} \in \mathbb{B}^n$
(next state from $s$ for $\iota$).

- $\delta$ is a total function from $\mathbb{B}^{m+n}$ to $\mathbb{B}^p$ (output function), also given as a
vector $[\delta_\ell]_{\ell=1 \ldots p}$ of $p$ functions from $\mathbb{B}^{m+n}$ to $\mathbb{B}$. $\delta(\iota, s)$ is the output
in $s$ for $\iota$.

An input sequence to the machine is an infinite sequence $I = (\iota_0, \iota_1, \ldots, \iota_i, \ldots)$
of vectors of $\mathbb{B}^m$. The run and the image of the machine on such an input
sequence $I$ are, respectively, the sequence $S = (s_0, s_1, \ldots, s_i, \ldots)$ of elements
of $\mathbb{B}^n$ (states), and the sequence $\Omega = (\omega_0, \omega_1, \ldots, \omega_i, \ldots)$ of elements of $\mathbb{B}^p$
(outputs), defined by

$s_0 = 0^n$ (initial state¹) and $\forall i \geq 0$, $s_{i+1} = \tau(\iota_i, s_i)$, $\omega_i = \delta(\iota_i, s_i)$

More concretely, we will name input, output, and state variables by identifiers, 
and define the transition functions using the “prime” notation (read x' as next 
x). 

Example 1: For instance, the system of Boolean equations:

$x' = \bar{b}\,(a + x) \qquad y' = \bar{z} + y.\bar{x} \qquad z' = \bar{y}\,(\bar{x} + z) \qquad u = y + z$

is a convenient way of describing a machine with 2 input variables (a, b), 3
state variables (x, y, z), and 1 output variable (u), with

$\tau_1(\iota, s) = \neg\iota[2] \wedge (\iota[1] \vee s[1]) \qquad \tau_2(\iota, s) = \neg s[3] \vee (s[2] \wedge \neg s[1])$

$\tau_3(\iota, s) = \neg s[2] \wedge (\neg s[1] \vee s[3]) \qquad \delta(\iota, s) = s[2] \vee s[3]$

Another classical way of describing a Mealy machine is by an operator 
network. Boolean functions are described by their gate networks, and state 
variables correspond to memories. Fig. 2 shows a network corresponding to the 
above machine. 
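The definitions of run and image can be made concrete with a few lines of Python. This is only an illustrative sketch of Example 1: the names `step` and `run`, and the encoding of Booleans as the integers 0 and 1, are assumptions of this sketch and do not come from the paper.

```python
from typing import Iterable, List, Tuple

State = Tuple[int, int, int]   # (x, y, z)
Input = Tuple[int, int]        # (a, b)


def step(inp: Input, state: State) -> Tuple[State, int]:
    """One sampling period of Example 1: returns (next state, output)."""
    a, b = inp
    x, y, z = state
    nx = (1 - b) & (a | x)            # x' = not b . (a + x)
    ny = (1 - z) | (y & (1 - x))      # y' = not z + y . not x
    nz = (1 - y) & ((1 - x) | z)      # z' = not y . (not x + z)
    u = y | z                         # u = y + z, read in the current state
    return (nx, ny, nz), u


def run(inputs: Iterable[Input]) -> Tuple[List[State], List[int]]:
    """Return the run (visited states) and the image (outputs) from 0^n."""
    state: State = (0, 0, 0)
    states: List[State] = []
    outputs: List[int] = []
    for inp in inputs:
        states.append(state)
        state, u = step(inp, state)
        outputs.append(u)
    return states, outputs


# Constant input a = 1, b = 0 for four periods.
states, outputs = run([(1, 0)] * 4)
```

With this constant input the state settles on (1, 1, 0) after three steps, which is a fixed point of `step` for that input.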

An operator network can be viewed as a directed graph, the nodes of which
are the operators and memories of the network, and the edges of which are the
“wires” oriented according to the direction of data circulation. In such a graph,
we will use the standard notion of “strongly connected components²” (SCC for
short): in Fig. 2, the two SCCs of the network are shown in dashed boxes. Let
us recall that SCCs can be determined in linear time [Tar72].

¹ We could have let the initial state be a parameter of the machine. The choice of $0^n$
is for simplicity.

² i.e., a subset of operators, each pair of which is connected by a directed path.
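As an illustration of the SCC computation, the sketch below runs a standard recursive version of Tarjan's algorithm on a simplified graph for Example 1. The graph is an assumption of this sketch: gates are collapsed, so nodes are variables and an edge v -> w means that v is read when computing w. The function name `tarjan_scc` is likewise hypothetical.

```python
from typing import Dict, List


def tarjan_scc(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Strongly connected components of a directed graph (Tarjan, 1972)."""
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    sccs: List[List[str]] = []

    def visit(v: str) -> None:
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            sccs.append(sorted(comp))

    for v in graph:
        if v not in index:
            visit(v)
    return sccs


# Variable-level dependency graph of Example 1 (gates collapsed).
example1 = {
    "a": ["x"], "b": ["x"],
    "x": ["x", "y", "z"],
    "y": ["y", "z", "u"],
    "z": ["y", "z", "u"],
    "u": [],
}
sccs = tarjan_scc(example1)
# Keep only the components that really contain a loop.
loops = [c for c in sccs if len(c) > 1 or c[0] in example1[c[0]]]
```

Only the components kept in `loops` matter for stability; single nodes without a self-loop are discarded.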




Fig. 2. A Mealy machine as an operator network



Finally, we will make extensive use of the following representation of
memories by two-state automata: let x be a state variable, whose evolution is
defined by the equation “$x' = f_x$”. Let us consider the Shannon expansion of the
Boolean function $f_x$ according to x: $f_x = \bar{x}.f_x^0 + x.f_x^1$, where $f_x^0$ and $f_x^1$
are independent of x. Then $f_x^0$ (resp., $\overline{f_x^1}$) is the condition that sets the
memory x from 0 to 1 (resp., that resets it from 1 to 0); we will note it $set_x$
(resp. $reset_x$). As a consequence, each memory x has a canonical representation
as a two-state automaton: a transition labelled $set_x$ leads from the state where
x = 0 to the state where x = 1, and a transition labelled $reset_x$ leads back.

For instance, in Example 1, the state variable z is defined by the equation
$z' = \bar{y}(\bar{x} + z)$. Its Shannon expansion according to z is
$z' = \bar{z}.\bar{y}.\bar{x} + z.\bar{y}$, so $set_z = \bar{y}.\bar{x}$ and $reset_z = y$.
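The set and reset conditions are simply the two cofactors of the next-state function. The following sketch computes them by substitution and tabulates them for z in Example 1; the helper name `set_reset` and the dictionary-based environments are assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Dict, Tuple

Env = Dict[str, int]
BoolFn = Callable[[Env], int]


def set_reset(f: BoolFn, var: str) -> Tuple[BoolFn, BoolFn]:
    """Return (set_var, reset_var) as functions of the other variables."""
    def set_cond(env: Env) -> int:       # cofactor of f with var forced to 0
        return f({**env, var: 0})

    def reset_cond(env: Env) -> int:     # complement of the cofactor with var = 1
        return 1 - f({**env, var: 1})

    return set_cond, reset_cond


def f_z(env: Env) -> int:                # z' = not y . (not x + z)
    return (1 - env["y"]) & ((1 - env["x"]) | env["z"])


set_z, reset_z = set_reset(f_z, "z")
# Truth table indexed by (x, y), holding the pair (set_z, reset_z).
table = {(x, y): (set_z({"x": x, "y": y}), reset_z({"x": x, "y": y}))
         for x, y in product((0, 1), repeat=2)}
```

The table agrees with the closed forms given in the text: the first component is 1 only when x = y = 0, and the second component equals y.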



3 Stability 

We now formalize the notion of stability we want to analyze: an infinite sequence
$I = (\iota_0, \iota_1, \ldots, \iota_i, \ldots)$ is said to be ultimately stable if it is constant from some
term, i.e., if

$\exists i \in \mathbb{N}$ such that $\forall j \geq i,\ \iota_j = \iota_i$

A machine is said to be weakly stable if and only if, on every ultimately stable
input sequence, its image (sequence of outputs) is ultimately stable.

First notice that this property is trivial neither to specify nor to verify. As
a temporal logic formula [Pnu77], it would be written like

$\Diamond\Box \bigwedge_{k=1}^{m} (\iota_k = \bigcirc\iota_k) \;\Rightarrow\; \Diamond\Box \bigwedge_{\ell=1}^{p} (\omega_\ell = \bigcirc\omega_\ell)$

and verifying it by model-checking [VW86] is likely to be very expensive.










Fig. 3. A simple machine and its state graph 



Weak stability is the property we are interested in, in view of the arguments
given in the Introduction. In fact, we will consider a stronger property, because it
is easier to check, and because experience shows that, on actual applications, it is
generally very close to the wanted property. This stronger property states that,
in the presence of ultimately stable inputs, the state stabilizes in a finite number of
steps:

A machine is said to be strongly stable (or simply stable) if and only if, on 
every ultimately stable input sequence, its run (sequence of states) is ultimately 
stable. Obviously, strong stability implies weak stability. 

Example 2: Fig. 3.a shows the network of a machine with 1 input a, 2 state
variables x, y, and one output x, defined by the system of equations

x' = a{x + y) y' = ay + xy 

Fig. 3.b shows the graph of reachable states of the machine. Possible
instabilities appear on this graph as circuits: apart from self-loops (which cannot
correspond to instabilities), the graph contains 3 elementary circuits:

(1) {x y)-^{xy)-^{xy)-^{x y) _ 

( 2 ) {xy)-^{xy)-^{xy)-^{xy)-^{xy) 

(3) {xy)^{xy)-^{xy) 

Circuits (1) and (2) don’t correspond to instabilities, since their traversal
involves input changes. Circuit (3), however, corresponds to an oscillation when the
input a remains true. This oscillation does not change the output x but makes the
state variable y oscillate. As a consequence, the machine is weakly stable, but
not strongly stable.

This example shows that both weak and strong stability can be checked on
the state graph of a machine, by examining its elementary circuits. However, this
method is clearly expensive, and infeasible for complex systems. This is why we
now investigate heuristic methods that avoid the construction of the state graph.
In the current state of our research, these heuristics only apply to strong stability.
Let us stress that, in the rest of the paper, we will never consider state graphs
and their circuits, but only the operator networks and their strongly connected
components!
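For comparison with the heuristics that follow, the exact (and expensive) check on the state graph can be sketched as a brute-force search: for every reachable state and every constant input, follow the run until a state repeats and test whether the circuit entered is a self-loop. The function names are hypothetical, the first machine is Example 1, and the second is a one-memory toggle invented for this sketch (it is not a machine from the paper).

```python
from itertools import product
from typing import Callable, List, Set, Tuple

State = Tuple[int, ...]
Input = Tuple[int, ...]
Step = Callable[[Input, State], State]


def reachable_states(step: Step, n_inputs: int, init: State) -> Set[State]:
    """All states reachable from init under arbitrary input sequences."""
    inputs = list(product((0, 1), repeat=n_inputs))
    seen, todo = {init}, [init]
    while todo:
        s = todo.pop()
        for i in inputs:
            t = step(i, s)
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def oscillations(step: Step, n_inputs: int, init: State) -> List[Tuple[Input, State]]:
    """Pairs (constant input, state) lying on a circuit that is not a self-loop."""
    found: List[Tuple[Input, State]] = []
    states = reachable_states(step, n_inputs, init)
    for i in product((0, 1), repeat=n_inputs):
        for s in states:
            visited = set()
            while s not in visited:       # follow the run under constant input i
                visited.add(s)
                s = step(i, s)
            # s now lies on the circuit that the run has entered
            if step(i, s) != s and (i, s) not in found:
                found.append((i, s))
    return found


def example1(inp: Input, st: State) -> State:
    a, b = inp
    x, y, z = st
    return ((1 - b) & (a | x), (1 - z) | (y & (1 - x)), (1 - y) & ((1 - x) | z))


def toggle(inp: Input, st: State) -> State:   # hypothetical: y' = a . not y
    return (inp[0] & (1 - st[0]),)


stable_example1 = oscillations(example1, 2, (0, 0, 0)) == []
toggle_cycles = oscillations(toggle, 1, (0,))
```

The search is exponential in the number of memories and inputs, which is precisely the cost the heuristics of the next sections try to avoid.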

4 Strongly Connected Components 

A very first (obvious) remark is that a memory cannot oscillate if its input is
stable; so, a memory can only cause an instability if it appears in a loop of
the network. An important consequence is that we can consider each strongly
connected component of the network separately:

- An SCC of the network is stable if, considered alone, it represents a stable
machine. A sufficient condition for a machine to be stable is that all the
SCCs of its network be stable.

- A memory that doesn’t belong to an SCC cannot introduce instability.

Another simple case concerns SCCs that contain only one memory: let this
memory be associated with a state variable x, defined by the equation
“$x' = f_x(x, y_1, \ldots, y_k)$”, where the $y_i$ are either inputs or state variables from other
SCCs. The function $f_x$ is said to be monotonic in x if, for each valuation of
$y_1, \ldots, y_k$, the implication $f_x(0, y_1, \ldots, y_k) \Rightarrow f_x(1, y_1, \ldots, y_k)$ holds.

Condition 1: If an SCC contains only one memory x, and if $f_x$ is monotonic
in x, the considered SCC is stable.

For instance, in our Example 1 (Fig. 2), the network has two SCCs: the
former contains only the memory x, and the latter contains both y and z. Then,
since the function defining x, $\bar{b}(a + x)$, is monotonic in x, the former SCC is
stable, and we only have to analyze the second SCC, considering x as its only
input.
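Condition 1 amounts to a small truth-table check. The sketch below tests monotonicity by enumeration; `is_monotonic` is a hypothetical helper, `f_x` is the function of Example 1, and `f_t` is an invented non-monotonic function included only for contrast.

```python
from itertools import product
from typing import Callable


def is_monotonic(f: Callable[..., int], n_others: int) -> bool:
    """True iff f(0, ys) implies f(1, ys) for every valuation ys of the other variables."""
    return all(f(0, *ys) <= f(1, *ys)
               for ys in product((0, 1), repeat=n_others))


def f_x(x: int, a: int, b: int) -> int:   # x' = not b . (a + x), from Example 1
    return (1 - b) & (a | x)


def f_t(t: int, a: int) -> int:           # hypothetical toggle: t' = a . not t
    return a & (1 - t)


x_is_monotonic = is_monotonic(f_x, 2)
t_is_monotonic = is_monotonic(f_t, 1)
```

On Boolean values encoded as 0 and 1, the comparison `<=` is exactly logical implication, which is why it can stand in for the implication of the definition.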



5 Local Cycle Conditions 

Our second criterion looks very weak at first glance, but provides surprisingly good
results [HL97]: for a memory x to oscillate, it must be able to change in both
directions with the same values for input variables. More formally, there should
exist two instants $t_1$ and $t_2$ at which the inputs are the same, and such that $set_x$
holds at $t_1$ and $reset_x$ holds at $t_2$. If $\iota$ (resp. $s$) represents the vector of input
(resp., state) variables, and if $s_1$ and $s_2$ represent the values of state variables at $t_1$
and $t_2$, respectively, this condition can be written

$\exists \iota, s_1, s_2,\ set_x(\iota, s_1).reset_x(\iota, s_2) = 1$

Moreover, state variables that have already been found stable can be considered
(as inputs) to have the same values in $s_1$ and $s_2$. So, in general, we will have
a set $\sigma$ of stable variables (including inputs) and a set $\xi$ of remaining variables,
and, for each $x \in \xi$, the considered condition will be

$\exists \sigma, \xi_1, \xi_2,\ set_x(\sigma, \xi_1).reset_x(\sigma, \xi_2) = 1$

Let us note $U_x^\sigma$ the condition $\exists \xi_1, \xi_2,\ set_x(\sigma, \xi_1).reset_x(\sigma, \xi_2)$ and call it
the local cycle condition for x in $\sigma$.

Condition 2: If $\sigma$ is a set of inputs or stable variables, the unsatisfiability
of $U_x^\sigma$ is a sufficient condition for x to be stable³.

This provides us with an algorithm to analyze an SCC of the network: let $\iota$
and $s$ be, respectively, the sets of input and of state variables of the SCC. The
following algorithm returns in $\sigma$ (resp., in $\xi$) the set of memories that have been
found stable (resp., the stability of which is not guaranteed):

Algorithm 1: start with $\sigma = \iota$; $\xi = s$;
    while $\exists x \in \xi$ such that $U_x^\sigma = 0$ do
        $\xi := \xi \setminus \{x\}$; $\sigma := \sigma \cup \{x\}$;
    end while

Example 3: Let us consider an SCC with 2 inputs a and b, and two state variables
x and y, defined by $x' = y + \bar{a}$, $y' = b(x + y)$.

We have $set_x = y + \bar{a}$, $reset_x = \bar{y}.a$, $set_y = x.b$, $reset_y = \bar{b}$.
Starting the algorithm with $\sigma = \{a, b\}$, $\xi = \{x, y\}$, we get

$U_x^\sigma = \exists y_1, y_2,\ set_x(a, b, y_1).reset_x(a, b, y_2)$
$\quad = \exists y_1, y_2,\ (y_1 + \bar{a})(\bar{y}_2.a)$
$\quad = \exists y_1, y_2,\ y_1.\bar{y}_2.a$
$\quad = a$
$\quad \neq 0$ (so, nothing can be concluded for x)

$U_y^\sigma = \exists x_1, x_2,\ set_y(a, b, x_1).reset_y(a, b, x_2)$
$\quad = \exists x_1, x_2,\ (x_1.b).\bar{b}$
$\quad = 0$

So, y is stable, and we iterate with $\sigma = \{a, b, y\}$ and $\xi = \{x\}$. For this new $\sigma$,
we get

$U_x^\sigma = set_x(a, b, y).reset_x(a, b, y)$
$\quad = (y + \bar{a}).(\bar{y}.a)$
$\quad = 0$

and x is found stable too.
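Algorithm 1 can be prototyped by plain enumeration instead of BDDs. The sketch below is a minimal illustration under that assumption: each memory is given by hypothetical Python functions for its set and reset conditions, and the helper names (`assignments`, `cycle_condition_satisfiable`, `algorithm1`) are not from the paper. It is run on Example 3.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Iterator, Set, Tuple

Env = Dict[str, int]
Cond = Callable[[Env], int]


def assignments(names: Iterable[str]) -> Iterator[Env]:
    """Enumerate every Boolean valuation of the given variable names."""
    names = sorted(names)
    for bits in product((0, 1), repeat=len(names)):
        yield dict(zip(names, bits))


def cycle_condition_satisfiable(set_x: Cond, reset_x: Cond,
                                sigma: Set[str], xi: Set[str]) -> bool:
    """True iff some valuation of sigma admits xi1, xi2 with set_x.reset_x = 1."""
    for s in assignments(sigma):
        can_set = any(set_x({**s, **v}) for v in assignments(xi))
        can_reset = any(reset_x({**s, **v}) for v in assignments(xi))
        if can_set and can_reset:
            return True
    return False


def algorithm1(memories: Dict[str, Tuple[Cond, Cond]],
               inputs: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (sigma, xi): variables known stable, and memories left undecided."""
    sigma, xi = set(inputs), set(memories)
    progress = True
    while progress:
        progress = False
        for x in sorted(xi):
            set_x, reset_x = memories[x]
            if not cycle_condition_satisfiable(set_x, reset_x, sigma, xi):
                xi.remove(x)
                sigma.add(x)
                progress = True
                break                     # restart with the enlarged sigma
    return sigma, xi


# Example 3: set_x = y + not a, reset_x = not y . a, set_y = x . b, reset_y = not b
example3 = {
    "x": (lambda e: e["y"] | (1 - e["a"]), lambda e: (1 - e["y"]) & e["a"]),
    "y": (lambda e: e["x"] & e["b"], lambda e: 1 - e["b"]),
}
stable, undecided = algorithm1(example3, {"a", "b"})
```

As in the text, y is moved to the stable set first, after which the cycle condition of x becomes unsatisfiable and x follows.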

A more precise condition can be found by considering pairs of dependent
state variables in the same SCC. Let $\mathcal{D}(x)$ be the set of state variables that
belong to the same SCC as x, and appear either in $set_x$ or in $reset_x$. Assume
the previous algorithm failed to show the stability of two variables x, y, with
$y \in \mathcal{D}(x)$. We already know that, when $U_y^\sigma$ is false, y is stable. So we can split
the definition of $U_x^\sigma$ into two cases:

- the case where $U_y^\sigma$ holds;

- the case where $U_y^\sigma = 0$, in which case the value of y can be assumed to be
stable (i.e., $y_1 = y_2$).

This leads to a stronger condition of instability for x:

$U_{x,y}^\sigma = \exists \xi_1, \xi_2,\ \big(U_y^\sigma.set_x(\sigma, \xi_1).reset_x(\sigma, \xi_2)\big)$
$\qquad\qquad + \big(\overline{U_y^\sigma}.set_x(\sigma, \xi_1).reset_x(\sigma, \xi_2).(y_1 = y_2)\big)$

Condition 3: If $\sigma$ is a set of inputs or stable variables, and if $y \in \mathcal{D}(x)$, the
unsatisfiability of $U_{x,y}^\sigma$ is a sufficient condition for x to be stable.

Obviously, this condition is more likely to succeed for x and y satisfying
$U_x^\sigma.U_y^\sigma = 0$.

When Algorithm 1 fails to show that all memories of an SCC are stable (i.e.,
returns $\xi \neq \emptyset$), we can apply the following one:

Algorithm 2: while $\exists x \in \xi$ such that $\exists y \in \mathcal{D}(x)$ such that $U_{x,y}^\sigma = 0$ do
        $\xi := \xi \setminus \{x\}$; $\sigma := \sigma \cup \{x\}$;
    end while

³ Notice that Condition 1 is only a special case of Condition 2.

Of course, this approach could be continued with 3 variables and more, but 
would become more and more expensive. 

Example: Let’s come back to Example 1. In Section 4, we showed that the first
SCC of the network, reduced to the state variable x, is stable. Now, if we
apply Algorithm 1 to the second SCC, we get:

$\sigma = \{x\}$, $\xi = \{y, z\}$
$U_y^\sigma = \exists z_1, z_2,\ x.\bar{z}_1.z_2 = x$
$U_z^\sigma = \exists y_1, y_2,\ \bar{x}.\bar{y}_1.y_2 = \bar{x}$

So, nothing can be deduced about the stability of y and z. Now, Algorithm 2
provides:

$U_{y,z}^\sigma = \bar{x}.x + x.(\exists z,\ x.\bar{z}.z) = 0$
$U_{z,y}^\sigma = x.\bar{x} + \bar{x}.(\exists y,\ \bar{x}.\bar{y}.y) = 0$

from which we can conclude that y and z are stable.
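The pair condition of Condition 3 can be checked by enumeration in the same spirit. The sketch below evaluates it on the second SCC of Example 1; the helper names `U` and `U_pair` and the lambda encodings of the set and reset conditions are assumptions of this sketch, and an actual tool would presumably manipulate these conditions symbolically.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Iterator, Set, Tuple

Env = Dict[str, int]
Cond = Callable[[Env], int]


def assignments(names: Iterable[str]) -> Iterator[Env]:
    """Enumerate every Boolean valuation of the given variable names."""
    names = sorted(names)
    for bits in product((0, 1), repeat=len(names)):
        yield dict(zip(names, bits))


def U(set_f: Cond, reset_f: Cond, s: Env, xi: Set[str]) -> bool:
    """Value of the local cycle condition at the valuation s of sigma."""
    return (any(set_f({**s, **v}) for v in assignments(xi))
            and any(reset_f({**s, **v}) for v in assignments(xi)))


def U_pair(x: str, y: str, mem: Dict[str, Tuple[Cond, Cond]],
           s: Env, xi: Set[str]) -> bool:
    """Value of the pair condition for x, given y in D(x), at the valuation s."""
    set_x, reset_x = mem[x]
    if U(*mem[y], s, xi):                 # y may oscillate: plain cycle condition
        return U(set_x, reset_x, s, xi)
    others = xi - {y}                     # y is stable here: force y1 = y2 = b
    return any(
        any(set_x({**s, **v, y: b}) for v in assignments(others))
        and any(reset_x({**s, **v, y: b}) for v in assignments(others))
        for b in (0, 1)
    )


# Second SCC of Example 1, with sigma = {x} and xi = {y, z}:
# set_y = not z, reset_y = z . x, set_z = not y . not x, reset_z = y
mem = {
    "y": (lambda e: 1 - e["z"], lambda e: e["z"] & e["x"]),
    "z": (lambda e: (1 - e["y"]) & (1 - e["x"]), lambda e: e["y"]),
}
sigma, xi = {"x"}, {"y", "z"}
plain_y = any(U(*mem["y"], s, xi) for s in assignments(sigma))
plain_z = any(U(*mem["z"], s, xi) for s in assignments(sigma))
pair_yz = any(U_pair("y", "z", mem, s, xi) for s in assignments(sigma))
pair_zy = any(U_pair("z", "y", mem, s, xi) for s in assignments(sigma))
```

Both plain cycle conditions are satisfiable, so Algorithm 1 is inconclusive, while both pair conditions are unsatisfiable, matching the conclusion of the example.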

6 Approximation of Stable Values 

When the previous heuristics fail to show the stability of an SCC, the SCC
contains variables with feasible cycle conditions U: these are formulas involving
input and state variables, some of which are already found stable. We can try to
show that no stable values of the stable state variables can make the cycle
condition feasible. However, we don’t know the possible stable values of state variables.
Here, we propose [HL97] to use an approximation of these values: as a matter of
fact, for each state variable x, we have the following information about its stable
values:

When x stabilizes to 1, surely $reset_x$ is false
When x stabilizes to 0, surely $set_x$ is false

In other words, the stable values $\tilde{x}$ of x satisfy: $set_x \Rightarrow \tilde{x} \Rightarrow \overline{reset_x}$

So, if, in a cycle condition U, we replace each occurrence of x (resp., of $\bar{x}$)
by $\overline{reset_x}$ (resp., by $\overline{set_x}$), we get a new condition U' which is weaker than U
($U \Rightarrow U'$). If U' happens to be identically false, so is U, and the considered cycle
cannot introduce instability. Of course, this can be tried with all state variables
appearing in U, or with combinations of these variables.

Fig. 4. The network of Example 4

Example 4- Let us consider the system represented in Fig. 4, whose system of 
equations is: x' = h {a + x) , w' = b.w + x .w . 

There are two SCCs, one containing x and the other containing w. The first
one is found stable, as in Section 4. In the second SCC, none of our previous
heuristics works:

~ fw = b.w + X .w is not monotonic in w 

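- $U_w^\sigma = b.x \neq 0$

- $\mathcal{D}(w) = \emptyset$

Now the local cycle condition for w in {a, b, x} is $U_w^{\{a,b,x\}} = b.x$. Since
$reset_x = b$, replacing x by $\overline{reset_x} = \bar{b}$ gives $U' = b.\bar{b}$, which is identically
false. This shows that w is stable.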
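The substitution step can be checked mechanically. The sketch below takes the cycle condition b.x stated above and the set/reset conditions of x derived from $x' = \bar{b}(a + x)$; the function names are hypothetical, and the substitution is written by hand for this one formula rather than as a general rewriting procedure.

```python
from itertools import product


def set_x(a: int, b: int) -> int:        # set_x = a . not b
    return a & (1 - b)


def reset_x(a: int, b: int) -> int:      # reset_x = b
    return b


def U(a: int, b: int, x: int) -> int:    # cycle condition of w in {a, b, x}: b . x
    return b & x


def U_approx(a: int, b: int) -> int:
    """U with the (positive) occurrence of x replaced by not reset_x."""
    return U(a, b, 1 - reset_x(a, b))


satisfiable_U = any(U(a, b, x) for a, b, x in product((0, 1), repeat=3))
satisfiable_U_approx = any(U_approx(a, b) for a, b in product((0, 1), repeat=2))

# Soundness on stable values: whenever set_x <= x <= not reset_x, U implies U'.
sound = all(U(a, b, x) <= U_approx(a, b)
            for a, b, x in product((0, 1), repeat=3)
            if set_x(a, b) <= x <= 1 - reset_x(a, b))
```

U itself is satisfiable, but the weakened condition is not, so no stable value of x can make the cycle feasible.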

7 Application and Future Work 

From an industrial point of view, stability is by no means a theoretical problem. 

The first reason is that oscillating Booleans are frequently observed in nuclear
power plants. This may result from various causes, including hysteresis
phenomena concerning analog signals, or other physical or technological reasons; but
there has been evidence of such oscillations due to the functional specification
level itself. In these cases, formal analysis would have detected the problem before
it actually occurred.

The second implication of instability concerns distributed command: as many 
industrial systems are implemented on several processors (and this is obviously 
the case in power plants), stability often appears as a necessary condition for the 
global validity of the command system. Therefore, it may be useful to predict 
both stability itself, and the maximum delay within which it is reached. 








Being aware of these reasons amongst others, the French nuclear safety
authority (IPSN) is keeping a close watch on formal methods, even if it has not
formally recommended them up to now. Anticipating the recommendation, the
French utility Electricité de France (EDF) has already undertaken an R&D
program on the subject; it is interesting to point out that Boolean stability was the
first item to be studied.

The techniques proposed in this paper have been tried out on actual
systems (control software for nuclear plants). All the presented examples come
from this actual software, and were considered to be problematic before being
analyzed. These experiments show that the proposed heuristics give precise
results in practice: almost all the cases which were not found stable presented
actual instability problems. These experiments were performed by hand, but
the feasibility of such a manual application shows that its cost is far below
that of exact verification, a point which is strongly confirmed by the first
experiments with the implemented prototype.

These results are so encouraging that an actual industrial tool for checking 
stability is under implementation. The tool takes system descriptions written in 
Lustre [HCRP91], or by means of “logical functional diagrams” used at EDF. 
All the Boolean conditions are dealt with using BDDs [Ake78,Bry86]. 

Stability should be a very interesting concept in synchronous program- 
ming [Hal93]. In fact, the stable state graph is an abstraction very similar — on 
a macroscopic level — to the synchronous abstraction, that considers as atomic 
all the “micro-steps” performed in the same reaction. From this point of view, 
stability is the macroscopic counterpart of the “causality” [SBT96,HM95,NK99] 
property, considered in synchronous programs. An interesting perspective would 
be to adapt the techniques proposed here to obtain a cheap causality checker for 
synchronous programs and circuits. 

The notion of stability can also be very fruitful in verification. For the kind
of systems we consider, critical properties are very likely to be required to hold
only in stable states. In this case, only the reduced graph of stable states has to
be examined, and this can dramatically reduce the cost of verification by
model-checking. For instance, in a machine with n inputs, if all the inputs are sampled
in a memory before being used, the number of states is multiplied by $2^n$ with
respect to the number of stable states.

References 

Ake78. S. B. Akers. Binary decision diagrams. IEEE Transactions on Computers,
C-27(6), 1978.

Bry86. R. E. Bryant. Graph-based algorithms for boolean function manipulation.
IEEE Transactions on Computers, C-35(8):677-692, 1986.

Hal93. N. Halbwachs. Synchronous programming of reactive systems. Kluwer
Academic Pub., 1993.

HCRP91. N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The synchronous
dataflow programming language Lustre. Proceedings of the IEEE, 79(9):1305-1320,
September 1991.

HL97. J.-F. Héry and J.-C. Laleuf. Stabilité de la réalisation des DFL. Technical
Report, Electricité de France, 1997.

HM95. N. Halbwachs and F. Maraninchi. On the symbolic analysis of combinational
loops in circuits and synchronous programs. In Euromicro ’95, Como (Italy),
September 1995.

HN96. D. Harel and A. Naamad. The Statemate semantics of Statecharts. ACM
Transactions on Software Engineering and Methodology, 5(4), October 1996.

IEC93. IEC. International standard for programmable controllers: Programming
languages. Technical report IEC 1131, part 3, International Electrotechnical
Commission, 1993.

Mal93. S. Malik. Analysis of cyclic combinational circuits. In ICCAD’93, Santa
Clara (Ca), 1993.

NK99. K. S. Namjoshi and R. P. Kurshan. Efficient analysis of cyclic definitions.
In 11th International Conference on Computer Aided Verification, CAV’99,
Trento (Italy), July 1999.

Pnu77. A. Pnueli. The temporal logic of programs. In 18th Symp. on the
Foundations of Computer Science, Providence R.I., 1977. IEEE.

SBT96. T. R. Shiple, G. Berry, and H. Touati. Constructive analysis of cyclic
circuits. In International Design and Testing Conference IDTC’96, Paris,
France, 1996.

Tar72. R. E. Tarjan. Depth-first search and linear graph algorithms. SIAM Journal
on Computing, 1:146-160, 1972.

VW86. M. Y. Vardi and P. Wolper. An automata-theoretic approach to automatic
program verification. In Symposium on Logic in Computer Science, June
1986.




Issues in the Refinement of 
Distributed Programs 
(Invited Talk) 



Yoram Moses 

Department of Electrical Engineering 
Technion — Israel Institute of Technology 
Haifa, 32000 Israel 
moses@ee.technion.ac.il 

Developing correct computer programs is a notoriously difficult task, which 
has attracted a significant intellectual effort over the past decades. One attractive 
methodology that has been proposed to tackle this problem consists of systems 
for program refinement, in which a calculus is given for transforming, often in 
a top-down manner, the specification of a computational task into a program 
implementing this specification (excellent introductions to refinement are Back 
and von Wright 1998 and Morgan 1994). Calculi for the refinement of sequential 
programs are by now a mature and well-established field. In this abstract, I wish 
to discuss some issues that arise when we try to develop a refinement calculus 
for distributed programs. This discussion is based on a joint project with Ron 
van der Meyden and Kai Engelhardt of the University of New South Wales, 
Sydney, Australia. Some insight into the technical aspects of the approach we 
are pursuing can be found in Engelhardt et al. 1998 and 2000 and in van der 
Meyden and Moses 2000.¹ An obvious point to start a discussion of refinement for 
distributed programs is the sequential case. The subtlety and inherent complexity 
of distributed systems make the task of refinement for distributed programs much 
harder. The purpose of this abstract is to discuss, in an informal fashion, some 
of the distinctive issues that seem to play a role in this effort. The hope is that a 
discussion of these issues may contribute to other work on formal and algorithmic 
approaches to distributed computation. 



On sequential refinement: We have described the goal of a refinement 
calculus as being the transformation of a specification into an implementation. 
Roughly speaking, then, we start out with an object of one type — specification — 
and end up with an object of another type — an executable program. In the process 
of transforming the former to the latter we may have “intermediate” objects that 
do not qualify as being pure specifications nor as being executable programs. To 
overcome this disparity, it is common to define a larger class of programs that will 
contain specifications and executable programs, as well as all of the intermediate- 
form programs that can arise in the course of the refinement process. A natural 
question, then, is what this space should consist of. For sequential programs, 

¹ Insights described here have been obtained as part of this joint work; any mistakes 
or misrepresentations are my own doing. 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 12-17, 2000. 

© Springer- Verlag Berlin Heidelberg 2000 







the picture is simplified by the fact that the model and goals are clear and well- 
defined. A sequential program starts at an initial state and, if it ever halts, ends 
at a final state. Moreover, its input is provided in the initial state, and its output 
is given in the final state. Notice that a nonterminating execution of such a 
program is considered useless. Typically all that we care about is the possibly 
partial input/output relation that this program instantiates. We can therefore 
identify programs with input/output relations. Specifications of a program can 
be described in terms of a desirable input/output relation, and concrete program 
commands can also be given semantics in this fashion. A more general view is 
to consider a program as a predicate transformer, following Dijkstra 1976: In 
this case, we identify a program with the change it brings about to the truth of 
predicates on the program’s state. This space is larger, it has very elegant logical 
properties, and it has proven a very useful basis for the refinement of sequential 
programs. 
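Both views can be made concrete on a small finite state space. The following is a minimal sketch, not the calculus of Dijkstra 1976 or of the refinement literature: the state space `STATES`, the program `INC` and the function `wp` are hypothetical names chosen for illustration. A program is a set of (initial, final) pairs, and its predicate transformer maps a postcondition to the set of initial states from which the program is guaranteed to halt in a state satisfying it.

```python
from typing import Callable, FrozenSet, Set, Tuple

State = int
Relation = Set[Tuple[State, State]]   # program as an input/output relation
Predicate = Callable[[State], bool]

STATES = range(4)  # hypothetical finite state space {0, 1, 2, 3}

# Hypothetical program: increment the state; it has no outcome at 3,
# so the relation is partial.
INC: Relation = {(s, s + 1) for s in STATES if s + 1 in STATES}


def wp(prog: Relation, post: Predicate) -> FrozenSet[State]:
    """Weakest precondition of `post` under `prog`.

    A state qualifies if the program has at least one outcome from it
    and every outcome satisfies the postcondition.
    """
    result = set()
    for s in STATES:
        outcomes = [t for (u, t) in prog if u == s]
        if outcomes and all(post(t) for t in outcomes):
            result.add(s)
    return frozenset(result)


print(sorted(wp(INC, lambda x: x >= 2)))  # [1, 2]
```

State 0 is excluded because its only outcome (1) violates the postcondition, and state 3 is excluded because the program has no outcome there.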

What should the program space be for distributed programs? The distributed 
setting adds complexity in a number of different ways. First, we believe that the 
notion of the state of a distributed computation is much less obvious than in the 
sequential case. Second, the concurrency of the computation adds a whole layer 
of complexity. Third, there are many models in which distributed computations 
are carried out, and a refinement calculus need not fully commit to a particular 
model at the outset. Finally, in a distributed system there can be many tasks 
that are reactive and ongoing. As a result, nonterminating programs are often 
desirable or even necessary. We now consider some of these issues in greater 
detail. 



Distributed States – transition vs. composition: The state plays two 
important roles in the context of a sequential computation. One is to be the 
object that actions performed by the program modify. Thus, for example, an 
action such as setting the variable x to 1 can be thought of as being applied 
to the state, resulting in a new state that differs from the first only in that the 
value of x is 1. With every program action we can thus associate a transition 
function on the states. A second role the state plays is being the start and end 
point of programs: A program will start at a well-defined state (its initial state), 
and if it terminates will end at a state (its final state). When we compose two 
programs P and Q by running P followed by Q, the final state of program P 
will be the initial state of Q. Moreover, whatever properties the output of P is 
guaranteed to have can be used as valid assumptions about the input of Q in 
this case. We thus view the state as playing a role in the transition caused by 
actions, and a role as the location at which control is passed from one program 
to the next in sequential composition of programs. The reader is fully justified in 
doubting the value of what has just been said. The distinction drawn between 
the two “roles” of the sequential state might not be all that convincing. After 
all, we can associate a transition function with a terminating program as well, 
and view such a program as a slightly generalized action. This distinction will 
hopefully be more vividly drawn out when we consider the distributed case. 
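The second role of the state, as the hand-over point in sequential composition, can be sketched for programs viewed as input/output relations. The programs `P` and `Q` and the function `compose` below are hypothetical illustrations, assuming a finite set of integer states.

```python
from typing import Set, Tuple

Relation = Set[Tuple[int, int]]  # (initial state, final state) pairs


def compose(p: Relation, q: Relation) -> Relation:
    """Sequential composition: the final state of p is the initial state of q."""
    return {(s, u) for (s, t) in p for (t2, u) in q if t == t2}


P: Relation = {(0, 1), (1, 2)}      # hypothetical program P
Q: Relation = {(1, 10), (2, 20)}    # hypothetical program Q

# Every output of P is a legal input of Q, so properties guaranteed
# for P's final state may be assumed about Q's initial state.
assert {t for (_, t) in P} <= {s for (s, _) in Q}

print(sorted(compose(P, Q)))  # [(0, 10), (1, 20)]
```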







What does the notion of a state become when we move to a distributed model? 
Perhaps the most popular answer taken in the literature is that the state of a 
distributed system amounts to an instantaneous “snapshot” of the system. This 
is sometimes called the global state of the system, and sometimes it is called 
the configuration of the system (cf. Lynch 1996, Attiya and Welch 1998). 
Typically it will consist of local states for the different active processes in the 
system, as well as states for inactive elements such as communication channels, 
shared variables and the like. Clearly, the effect of actions performed by the 
processes in a distributed system depends on the global state. The actions indeed 
transform the global state and we would argue that the global state is the 
analogue of a sequential state for the purpose of transition. 

The analysis of distributed algorithms and distributed programs in general 
is often performed for each specific task in isolation. It is typically carried out in 
essentially the same terms as we have described for the sequential case: Such a 
program will start in an initial (global) state and end in a final state that results 
when all participating processes have completed carrying out their individual 
tasks in the program. While this is adequate for the analysis of a given task in 
isolation, it is sometimes less so when solutions are to be composed. Let us con- 
sider an example of a refinement of a distributed task that breaks it into smaller 
subtasks. Suppose our goal is to perform a vote among the processes in a distri- 
buted system. Consider a refinement of this task into three parts: (a) Compute 
a minimum spanning tree (MST) of the network; (b) elect a leader using the 
MST; and (c) coordinate the vote through the leader. Clearly, these operations 
should be performed in sequence: Each of them relies on the completion of the 
previous ones. 

How should we compose the solutions? Solutions to each of the subtasks are 
typically assumed to start operating at a well-defined initial global state. But 
solutions to the MST problem or to leader election are not guaranteed to terminate 
at a well-defined global state (see, e.g., Gallager et al. 1983). The fact 
that the last process to perform an action on behalf of one of these tasks has 
completed doing so does not immediately become known to all of them. It is 
possible to overcome this problem by performing a synchronization step at the 
end of the intermediate steps, say in the form of termination detection (Fran- 
cez 1980). The cost of synchronizing the processes in the network can be high, 
however, and should be incurred only if needed. In our particular example this 
should not be necessary. 

Extending the intuitions underlying the work on communication-closed lay- 
ers (Elrad and Francez 1982, Chou and Gafni 1988, Stomp and de Roever 1994, 
Zwiers and Janssen 1994), we prefer to view distributed programs as operating 
between cuts, where a cut specifies an instant on the time-line of each of the 
processes. Intuitively this means that, as far as sequential composition is concer- 
ned, cuts constitute a distributed analogue of states of a sequential computation. 
With this view, it is possible to sequentially compose terminating programs such 
as those in our voting example. 







The seemingly small move of viewing distributed programs as operating bet- 
ween cuts has considerable repercussions. For example, we now need to consider 
much more carefully what assumptions must be made about the initial cut of a 
given program, to ensure the program behaves in the desirable fashion. In some 
cases, the initial cut needs to be fairly tightly synchronized, perhaps a global 
state or something very close to it (a consistent cut or a communication-closed 
cut). In other cases, we can make do with much less. For example, in many cases 
it suffices that a message that is sent by process i in the course of executing its 
portion of the distributed program P and is received by j before j crosses the 
initial cut of P, will be presented again to j after j starts executing its portion 
of P. Properly formalized, such a condition is all that is needed to solve our 
voting example in a reliable asynchronous model, without needing to perform 
costly synchronization between the three layers of the computation. We believe 
that the issues of synchronization between distributed activities raised by this 
form of refinement of distributed tasks deserve more attention than they have 
received in the past. 
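The notion of a cut, and the stronger notion of a consistent cut mentioned above, can be sketched as follows. This is an illustrative encoding only, under stated assumptions: a run is a hypothetical dictionary from process names to event lists, a cut gives the number of events each process has performed, and a cut is taken to be consistent when every message received before the cut was also sent before the cut.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str]        # (kind, message id); kind: "send", "recv", "local"
Run = Dict[str, List[Event]]   # per-process time-lines
Cut = Dict[str, int]           # per-process instant: number of events included


def is_consistent(run: Run, cut: Cut) -> bool:
    """True iff every receive inside the cut has its send inside the cut."""
    sent = set()
    received = set()
    for proc, events in run.items():
        for kind, msg in events[: cut[proc]]:
            if kind == "send":
                sent.add(msg)
            elif kind == "recv":
                received.add(msg)
    return received <= sent


# Hypothetical run: process i sends m1, process j later receives it.
run: Run = {
    "i": [("send", "m1"), ("local", "")],
    "j": [("local", ""), ("recv", "m1")],
}

print(is_consistent(run, {"i": 1, "j": 2}))  # True
print(is_consistent(run, {"i": 0, "j": 2}))  # False
```

The second cut is a perfectly good cut in the sense of the text, but it is not consistent: j has crossed it after receiving a message that i has not yet sent with respect to the cut. This is the situation in which a weaker condition, such as the message being presented to j again, has to be formulated.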

High-level programs: It is possible to extend the discussion on properties 
and assumptions on initial cuts one step further. Informally, let us define a cut 
formula to be a formula that is interpreted over cuts. For every cut formula $\varphi$ 
there will be a set of pairs $(r, c)$, where $r$ is a run and $c$ is a cut, at which $\varphi$ will 
be considered true, and false for all other pairs. Cut formulas can be treated 
just like state formulas in a sequential computation. In fact, we can now define 
branching and iteration at the level of distributed programs. If $P$ and $Q$ are 
distributed programs and $\varphi$ is a cut formula, we can define 

    if $\varphi$ then $P$ else $Q$    and    while $\varphi$ do $P$ 

as high-level distributed programs. High-level programs of this nature resemble 
structured programs in sequential computing. But there is an important diffe- 
rence between the two. Whereas in the sequential case the tests in if and while 
statements are usually executable and can be accepted by the compiler as such, 
the corresponding tests on cut formulas are not expected to be executable. Ne- 
vertheless, such high-level constructs of distributed programming can serve as 
rigorous descriptions of the desired behavior. This, in turn, can be helpful in the 
process of designing a distributed program, when the designer wishes to trans- 
form a specification of a desired program into a concrete implementation. For 
example, we might decide to implement a program for computing the MST of 
a network in an incremental fashion by repeatedly adding an MST edge to the 
forest under construction. This will correspond to executing a loop of the form 
while $\varphi$ do $P$, where $\varphi$ is the cut formula stating that the MST is not completed 
yet, and $P$ is a distributed program that adds one edge to the MST. We shall 
return to this point when we discuss refinement below. 
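The shape of such a loop can be sketched sequentially. The code below is only a centralized simulation under stated assumptions: the test `mst_not_complete` plays the role of the cut formula (which in the distributed setting is not expected to be executable), and `add_one_edge` stands in for the distributed program that adds one MST edge, here implemented as a Kruskal-style step. All names and the example graph are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, int]  # (weight, endpoint u, endpoint v)


def mst_not_complete(n_nodes: int, forest: List[Edge]) -> bool:
    """Stand-in for the cut formula: the forest is not yet a spanning tree."""
    return len(forest) < n_nodes - 1


def add_one_edge(edges: List[Edge], forest: List[Edge], comp: Dict[int, int]) -> None:
    """Stand-in for P: add the lightest edge joining two different components."""
    for w, u, v in sorted(edges):
        if comp[u] != comp[v]:
            forest.append((w, u, v))
            old, new = comp[v], comp[u]
            for node in comp:            # merge the two components
                if comp[node] == old:
                    comp[node] = new
            return
    raise ValueError("graph is not connected")


def build_mst(n_nodes: int, edges: List[Edge]) -> List[Edge]:
    forest: List[Edge] = []
    comp = {node: node for node in range(n_nodes)}
    while mst_not_complete(n_nodes, forest):   # while phi do P
        add_one_edge(edges, forest, comp)
    return forest


edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (3, 2, 3)]
print(build_mst(4, edges))  # [(1, 0, 1), (2, 1, 2), (3, 2, 3)]
```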

Termination: We have argued that considering programs as operating between 
cuts provides us with an improved facility for handling sequential composition 







of distributed programs. We need to assume, of course, that each participant in 
a distributed program ultimately terminates its participation in the program, 
before it can go on to performing the next program. This is taken for granted 
when dealing with sequential programs, where a nonterminating computation 
amounts to an outright failure and is usually totally undesirable. When we deal 
with distributed programs, however, there are common settings in which it is 
provable that interesting distributed programs cannot guarantee termination. 
For example, the work of Koo and Toueg 1988 shows that in a setting in which 
messages may be lost but communication channels are fair, every protocol that 
requires the successful delivery of at least one message in each run, must have 
executions in which one or more of the processes cannot terminate. This appears 
to be an issue if we are after sequential composition of programs in this model. 

An example of a nonterminating program in this context is the standard pro- 
tocol for sending a single bit between a sender S and a receiver R. The sender’s 
program is to send the bit repeatedly until it receives an acknowledgement. The 
receiver, in turn, sends one acknowledgement for each message it receives. A 
close analysis shows that the receiver must forever be ready to send one more 
acknowledgement in case it receives another copy of the bit. This example indi- 
cates that while the receiver cannot safely reach a terminating state with respect 
to the bit transmission program, the situation is not similar to the divergence 
of a sequential program. Despite not reaching a terminating state, the receiver 
should not be expected to abstain from taking part in further activities, provided 
that the receipt of a new copy of the bit will prompt the sending of an additional 
acknowledgement. 
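The behaviour of this protocol can be illustrated with a small deterministic simulation. It is a sketch under simplifying assumptions, not the model of Koo and Toueg 1988: time proceeds in rounds, and the hypothetical lists `data_lost` and `ack_lost` state for each round whether the bit or its acknowledgement is dropped.

```python
from typing import List, Optional, Tuple


def transmit(bit: int, data_lost: List[bool], ack_lost: List[bool]) -> Tuple[Optional[int], int, int]:
    """Simulate the protocol round by round.

    Returns (value held by the receiver, copies sent, acknowledgements sent).
    """
    received: Optional[int] = None
    copies_sent = 0
    acks_sent = 0
    for drop_data, drop_ack in zip(data_lost, ack_lost):
        copies_sent += 1          # sender retransmits until acknowledged
        if drop_data:
            continue              # the copy is lost in this round
        received = bit            # receiver obtains a (possibly repeated) copy
        acks_sent += 1            # and must acknowledge every copy it receives
        if not drop_ack:
            break                 # acknowledgement arrives; sender stops
    return received, copies_sent, acks_sent


# Round 0: bit lost. Round 1: bit arrives, ack lost. Round 2: both arrive.
print(transmit(1, [True, False, False], [False, True, False]))  # (1, 3, 2)
```

The receiver already holds the bit after round 1, yet it has to acknowledge again in round 2. Since it cannot know whether its latest acknowledgement was lost, it can never safely stop listening.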

Inspired by the work of Havelund and Larsen 1993, we subscribe to an ap- 
proach by which part of the activity of a process can be considered as taking 
place “in the background.” This activity need not terminate, and if it does, its 
termination need not cause another activity to start. We call a program that 
operates in the background a forked program, and have an explicit fork(P) ope- 
rator in our framework. In the bit transmission problem, for example, we can 
view the receiver as having a top-level or “foreground” activity consisting of waiting 
for the sender’s message, and consuming it, or making use of the information 
therein once it arrives. The “background” activity, which the receiver would 
happily delegate to an assistant or, say, the mail system, involves sending the 
acknowledgements. Once the message is obtained by the receiver, the receiver 
can proceed to its next task. A similar approach is readily applicable to a variety 
of well-known protocols and problems. 
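The foreground/background split for the receiver can be sketched with threads. This is an analogy only, not the fork(P) operator of the framework or the fork calculus of Havelund and Larsen 1993: the helper `fork`, the queues and the function names are hypothetical, and a Python daemon thread stands in for a forked program that never terminates.

```python
import queue
import threading
from typing import List


def fork(target, *args) -> threading.Thread:
    """Hypothetical fork(P): start P in the background without waiting for it."""
    thread = threading.Thread(target=target, args=args, daemon=True)
    thread.start()
    return thread


def acknowledge_forever(inbox: queue.Queue, acks: List[str], delivered: queue.Queue) -> None:
    """Background activity: acknowledge every copy; hand the first one over."""
    first = True
    while True:
        bit = inbox.get()
        acks.append("ack")
        if first:
            delivered.put(bit)
            first = False
        inbox.task_done()


def receiver(inbox: queue.Queue, acks: List[str]) -> int:
    """Foreground activity: obtain the bit, then terminate."""
    delivered: queue.Queue = queue.Queue()
    fork(acknowledge_forever, inbox, acks, delivered)
    return delivered.get()    # the receiver may now proceed to its next task


inbox: queue.Queue = queue.Queue()
acks: List[str] = []
for _ in range(3):            # the sender's three copies of the bit
    inbox.put(1)

bit = receiver(inbox, acks)
inbox.join()                  # wait until all three copies were acknowledged
print(bit, len(acks))         # 1 3
```

The foreground call returns as soon as the first copy is available, while the background thread keeps acknowledging later copies and is never joined.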

A separation of concerns between foreground and background activities ma- 
kes sequential composition of programs possible again even in models or for 
problems where termination is not a simple matter. It also facilitates modelling 
various delicate aspects of concurrency in a natural fashion. 

In summary, this abstract has covered only a few of the issues that arise when 
we attempt to develop a refinement calculus for distributed programs. We believe 
that the concerns raised by such a top-down view of distributed programming 
should receive more attention than they have in the past, and could give rise to 







new problems and techniques in the fields of distributed algorithms and program 
refinement. 



References 

Attiya, H., Welch, J.L.: Distributed Computing: Fundamentals, Simulations and 
Advanced Topics. McGraw-Hill (1998) 

Back, R.J., von Wright, J.: Refinement Calculus: A Systematic Introduction. Springer 
Verlag Graduate Texts in Comp. Sci. (1998) 

Chou, C., Gafni, E.: Understanding and verifying distributed algorithms using stratified 
decomposition. Proc. 7th ACM PODC (1988) 44-65 

Dijkstra, E.W.: A Discipline of Programming. Prentice Hall (1976) 

Engelhardt, K., van der Meyden, R., and Moses, Y.: Knowledge and the logic of local 
propositions. Proc. 7th Conf. on Theor. Aspects of Reasoning about Knowledge 
(TARK), Gilboa, I. Ed., Morgan Kaufmann (1998) 29-42 

Engelhardt, K., van der Meyden, R., and Moses, Y.: A program refinement framework 
supporting reasoning about knowledge and time. Foundations of Software Science 
and Computation Structures, Tiuryn, J. Ed., Springer Verlag (2000) 114-129 

Francez, N.: Distributed Termination. ACM Trans. Prog. Lang. and Syst., 2(1) (1980) 
42-55 

Gallager, R., Humblet, P., Spira, P.: A distributed algorithm for minimum-weight 
spanning trees. ACM Trans. on Prog. Lang. and Syst., 5(1) (1983) 66-77 

Elrad, T., Francez, N.: Decomposition of distributed programs into communication- 
closed layers. Sci. Comp. Prog., 2(3) (1982) 155-173 

Havelund, K., Larsen, K.G.: The fork calculus. Proc. 20th ICALP, LNCS 700 (1993) 
544-557 

Koo, R., Toueg, S.: Effects of message loss on termination of distributed protocols. Inf. 
Proc. Letters, 27 (1988) 181-188 

Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann Publishers (1996) 

van der Meyden, R., Moses, Y.: On refinement and temporal annotations, this volume. 

Morgan, C.: Programming from Specifications - 2nd ed. Prentice Hall (1994) 

Stomp, F., de Roever, W.P.: A principle for sequential reasoning about distributed 
systems. Form. Asp. Comp., 6(6) (1994) 716-737 

Zwiers, J., Janssen, W.: Partial-order based design of concurrent systems. Proc. REX 
Symp. “A decade of concurrency”, J. de Bakker, W.P. de Roever, G. Rozenberg 
eds., LNCS 803 (1994) 622-684 




Challenges in the Verification of Electronic 
Control Units 



Werner Damm 

OFFIS, University of Oldenburg 
Werner.Damm@Informatik.Uni-Oldenburg.DE 

Electronic Control Units control our cars, airplanes, trains, and other safety 
critical systems. The key motivation to maintain high safety standards in the 
light of increasing complexity as well as the need to reduce development costs, in 
particular time spent in testing, have been driving forces in promoting the use of 
formal techniques in software requirement specifications as well as during design 
and validation of software. As a result of this drive and the growing maturity 
of the employed verification tools, formal techniques have found their way into 
industrial design flows, such as the use of the B-method in Matra-Transport, and 
the use of the Sternol Verification Environment based on Prover at Adtranz Sig- 
naling Sweden. We see an increased pressure on the design process for on-board 
control software to move towards a formally based process, a central prerequi- 
site being the introduction of a model-based development process. This in itself 
constitutes already a significant shift. The step to model-based design processes 
has to a somewhat larger extent already been taken in both avionics and 
automotive, where tools like STATEMATE¹, Mathworks², MatrixX³, Scade⁴, and 
ASCET⁵ are routinely used at different stages in the development process for 
control software. For example, Aerospatiale uses the Scade tool to generate airborne 
software, with the cost benefits this induces. The same concern about safety has caused 
companies like Boeing and British Aerospace to also assess the use of formal 
verification methods. Similarly, in automotive, the incentive to reduce develop- 
ment costs by letting model-checking catch errors early on in the development 
process, or the use of model-checking to create a golden reference model in the 
manufacturer-supplier chain, has been a major motivation to investigate the use 
of model-checking based verification techniques. 

The talk surveys the state of the art in employing verification techniques 
in the above application domains, stressing the role of such techniques in a 
model based design process. The technical focus of the talk will be on recent 
advances in model-checking that allow a limited degree of first-order 
reasoning to be integrated into symbolic model-checking. The talk will also present evaluation 
results on using SAT based methods in connection with bounded model checking 
on representative industrial designs. 



¹ a registered trademark of I-Logix Inc. 
² a registered trademark of The MathWorks, Inc. 
³ a registered trademark of ISI Inc. 
⁴ a registered trademark of Verilog SA 
⁵ a registered trademark of ETAS GmbH 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 18-18, 2000. 
© Springer- Verlag Berlin Heidelberg 2000 




Scaling up Uppaal 

Automatic Verification of Real-Time Systems using 
Compositionality and Abstraction 



Henrik Ejersbo Jensen, Kim Guldstrand Larsen, and Arne Skou 

BRICS**, Aalborg University, Denmark 
{ejersbo,kgl,ask}@cs.auc.dk 



Abstract. To combat the state-explosion problem in automatic verifi- 
cation, we present a method for scaling up the real-time verification tool 
Uppaal by complementing it with methods for abstraction and composi- 
tionality. We identify a notion of timed ready simulation which we show 
is a sound condition for preservation of safety properties between real- 
time systems, and in addition is a precongruence with respect to parallel 
composition. Thus, it supports both abstraction and compositionality. 
We furthermore present a method for automatically testing for the exi- 
stence of a timed ready simulation between real-time systems using the 
Uppaal tool. 



1 Introduction 

Since the basic results by Alur, Courcoubetis and Dill [2] on decidability of 
model-checking for timed automata, a number of tools for automatic verification 
of hybrid and real-time systems have emerged [19,9,6]. These tools have by now 
reached a state, where they are mature enough for application on industrial 
development of real-time systems as witnessed by a number of already carried 
out case-studies [10,16,13,15,7]. Despite this success, the state-explosion problem 
is a reality¹ which prevents the tools from ever² being able to provide fully 
automatic verification of arbitrarily large and complex systems. Thus, to truly 
scale up, the automatic verification offered by the tools should be complemented 
by other methods. 

One such method is that of abstraction. Assume that SYS is a model of some 
considered real-time system, and assume that we want some property $\varphi$ to be 
established, i.e. $SYS \models \varphi$. Now, the model, SYS, may be too complex for our 
tools to settle this verification problem automatically. The goal of abstraction 
is to replace the problem with another, hopefully tractable problem $ABS \models \varphi$, 
where ABS is an abstraction of SYS being smaller in size and less complex. This 

** BRICS - Basic Research in Computer Science - is a basic research centre funded by 
the Danish government at Aarhus and Aalborg University. 
¹ Model-checking is either EXPTIME- or PSPACE-complete depending on the 
expressiveness of the logic considered. 

² unless we succeed in showing P=PSPACE 



M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 19-30, 2000. 
© Springer- Verlag Berlin Heidelberg 2000 







method requires the user not only to supply the abstraction but also to argue 
that the abstraction is safe in the sense that all relevant properties established for 
ABS also hold for SYS; i.e. it should be established that $SYS \le ABS$, for some 
property-preserving relationship $\le$ between models³. Unfortunately, this brings 
the problem of state-explosion right back in the picture because establishing 
$SYS \le ABS$ may be as computationally difficult as the original verification 
problem $SYS \models \varphi$. 

To alleviate the above problem, the method of abstraction may be combi- 
ned with that of compositionality. Here, compositionality refers to principles 
allowing properties of composite systems to be inferred from properties of their 
components. In particular we want to establish the safe abstraction condition, 
$SYS \le ABS$, in a compositional way, that is, assuming that SYS is a composite 
system of the form $SYS_1 \parallel SYS_2$, we may hope to find simple abstractions $ABS_1$ 
and $ABS_2$ such that: 

    $SYS_1 \le ABS_1$ and $SYS_2 \le ABS_2$    (1) 

Provided the relation $\le$ is a precongruence with respect to the composition 
operator $\parallel$, we may now complete the proof of the safe abstraction condition by 
establishing: 

    $ABS_1 \parallel ABS_2 \le ABS$    (2) 

This approach nicely factors the original problem $SYS \le ABS$ into the smaller 
problems of (1) and (2), and may be applied recursively until problems small 
enough to be handled by automatic means are reached. 
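The reasoning behind this factoring can be spelled out as a short chain. It is a sketch under two assumptions that the text leaves implicit: the precongruence property applies in both arguments of the composition operator, and the relation is transitive.

```latex
\begin{align*}
SYS \;=\; SYS_1 \parallel SYS_2
  &\;\le\; ABS_1 \parallel SYS_2
     && \text{by (1) and precongruence in the first argument} \\
  &\;\le\; ABS_1 \parallel ABS_2
     && \text{by (1) and precongruence in the second argument} \\
  &\;\le\; ABS
     && \text{by (2)}
\end{align*}
```

Transitivity of $\le$ then yields $SYS \le ABS$, so any property $\varphi$ with $ABS \models \varphi$ that is preserved by $\le$ also holds for SYS.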

The method of abstraction and compositionality is an old-fashioned recipe with 
roots going back to the original, foundational work on concurrency theory. For 
a nice survey on the history of compositional proof systems see [11]. Due to 
the reality of the state-explosion problem in automatic verification, there has 
recently been a renewed interest in applying the principles of abstraction and 
compositionality in combination with automatic model-checking [5,21,20,8,14]. 

The purpose of this paper is to present a tool-supported method for verify- 
ing properties of real-time systems using abstraction and compositionality. The 
tool we apply is the real-time verification tool Uppaal [19] developed jointly by 
BRICS at Aalborg University and Department of Computing Systems at Uppsala 
University. Uppaal provides support for automatic verification of safety 
properties of systems modelled as networks of timed automata communicating 
over (urgent) channels and shared integer variables. A fundamental relations- 
hip between timed automata preserving safety properties — and hence useful in 
establishing safe abstraction properties — is that of timed simulation. However, 
in the presence of urgent communication and shared variables, this relationship 
fails to be a precongruence, and hence does not support compositionality. A 

³ i.e. $A \le B$ and $B \models \varphi$ should imply that $A \models \varphi$. 







main contribution of this paper is to identify a notion of timed ready simulation 
supporting both abstraction and compositionality. 

Having identified the notion of timed ready simulation as the fundamental 
condition for property preservation between timed automata, there still remains 
the problem of how to establish such a relation in practice. In this paper, we 
provide a method for automatically testing for the existence of a timed ready 
simulation between timed automata using reachability analysis. Thus Uppaal 
may be applied for such tests. Assume that $\le$ is the simulation relation and that 
we want to check if $SYS \le ABS$. Our testing method then prescribes (1) the 
construction of a test automaton $T_{ABS}$ for ABS and (2) a check of whether a 
certain reject node can be reached in the composition $SYS \parallel T_{ABS}$. If a reject 
node can be reached, $SYS \not\le ABS$; otherwise $SYS \le ABS$. The automaton 
$T_{ABS}$ will continuously monitor for compliance with ABS and may be seen as 
a generalization of the early complement construction for deterministic timed 
automata [4,3] to take into account urgent channels and shared variables. 
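The second step, the reachability check, can be sketched for a drastically simplified setting. The code below ignores clocks, urgency and shared variables, does not use Uppaal, and does not construct the test automaton from ABS; it assumes a hand-written, hypothetical test automaton and searches the synchronous product for a reject node by breadth-first search.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Transitions = Dict[str, List[Tuple[str, str]]]  # node -> [(action, next node)]


def reject_reachable(sys_trans: Transitions, test_trans: Transitions,
                     sys_init: str, test_init: str, reject: Set[str]) -> bool:
    """Breadth-first search of the product, synchronizing on equal actions."""
    start = (sys_init, test_init)
    seen = {start}
    todo = deque([start])
    while todo:
        s, t = todo.popleft()
        if t in reject:
            return True
        for action, s_next in sys_trans.get(s, []):
            for test_action, t_next in test_trans.get(t, []):
                if action == test_action and (s_next, t_next) not in seen:
                    seen.add((s_next, t_next))
                    todo.append((s_next, t_next))
    return False


# Hypothetical test automaton for an abstraction that alternates a and b.
TEST: Transitions = {
    "t0": [("a", "t1"), ("b", "rej")],
    "t1": [("b", "t0"), ("a", "rej")],
}
GOOD: Transitions = {"s0": [("a", "s1")], "s1": [("b", "s0")]}
BAD: Transitions = {"s0": [("a", "s1")], "s1": [("a", "s0")]}

print(reject_reachable(GOOD, TEST, "s0", "t0", {"rej"}))  # False
print(reject_reachable(BAD, TEST, "s0", "t0", {"rej"}))   # True
```

In this untimed analogy, the first system complies with the abstraction because no reject node is reachable, while the second performs two consecutive a actions and drives the test automaton into its reject node.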

In Section 2 we present the timed automaton model and the fundamental 
notions of timed simulation and timed reachability. Section 3 presents our notion 
of timed ready simulation, which is shown to be a precongruence with respect to 
parallel composition. Section 4 presents our method for testing for the existence 
of a timed ready simulation, and Section 5 concludes. 



2 Timed Automata 

Semantically we model shared variable real-time systems using a standard labelled 
transition system model extended with capabilities for describing real-time 
behavior and communication via shared multi-reader/multi-writer variables. Our 
transition system model has two types of labels: atomic actions and delays, 
representing discrete and continuous changes of real-time systems. We will assume 
that $A$ is a universal set of actions used for synchronization between transition 
systems and $A_u \subseteq A$ is a special subset of urgent actions used to enforce 
immediate synchronization among transition systems. We use $a$ to range over $A$. 
We assume that $A$ is equipped with a mapping $\overline{\cdot} : A \to A$ such that $\overline{\overline{a}} = a$ for 
every $a \in A$. Moreover, we assume that for any $a \in A_u$, $\overline{a} \in A_u$ as well. We also 
assume the existence of a special internal action $\tau$ distinct from any action in 
$A$. We write $A_\tau$ to denote $A \cup \{\tau\}$ and we use $\mu$ to range over $A_\tau$. We use $\mathcal{D}$ to 
denote the set of delay actions $\{\epsilon(d) \mid d \in \mathbb{R}_{\ge 0}\}$ where $\mathbb{R}_{\ge 0}$ denotes the set of 
non-negative real numbers. 

We consider transition systems capable of communicating over a universal 
set $V$ of shared integer variables. A transition system comes equipped with a 
signature $\Sigma$ which is a tuple $(R, W, IW)$ of subsets of $V$. The signature describes 
sets of shared variables that are readable ($R$), writable ($W$), and internally 
writable ($IW$) by the transition system. We do not require sets $R$ and $W$ to be 
disjoint. Set $IW$ is a subset of $W$ consisting of variables writable only by the 
system itself but readable by all systems. 




22 



H.E. Jensen, K.G. Larsen, and A. Skou 



We assume that any state $s$ of a transition system at least provides a value 
for each integer variable $v \in V$.⁴ By slight misuse of function notation we denote 
this value by $s(v)$. We extend this notation in the obvious way to subsets $X$ of 
$V$. We define $s[X] = s'[X]$ iff $s(v) = s'(v)$ for all $v \in X$. 

In order to represent the effect of an environment upon an “open” transition 
system, we use a special set of environment actions $\mathcal{E} = \{\epsilon(X) \mid X \subseteq V\}$. An 
action $\epsilon(X)$ represents the environment updating the variables in $X$. We let 
$\nu$ range over $A_\tau \cup \mathcal{E}$. We now define our notion of timed transition system as 
follows. 

Definition 1. A timed transition system (TTS) is a tuple
$\mathcal{T} = (S, s_0, \Sigma, \rightarrow)$ where $S$ is a set of states, $s_0 \in S$ is
the initial state, $\Sigma = (R, W, IW)$ is a signature, and
$\rightarrow \subseteq S \times (A_\tau \cup \mathcal{E} \cup \mathcal{D}) \times S$ is a
transition relation.

For any state $s$ of a TTS and any label $\alpha$ we write $s \xrightarrow{\alpha}$ iff
there exists a state $s'$ such that $s \xrightarrow{\alpha} s'$. We require $\rightarrow$
to satisfy the well-known time properties of (1) time determinism, (2) time
additivity and (3) zero-delay [1]. Also, the environment should have the freedom
to update any variable outside $IW$. Thus, (4) for all $X \subseteq V \setminus IW$:
$s \xrightarrow{\epsilon(X)} s'$ iff $s'[X] = s[X]$, and (5) if $X \cap IW \neq \emptyset$
then $s \not\xrightarrow{\epsilon(X)}$.

In Uppaal, timed transition systems are described syntactically by timed
automata [4,19]. A timed automaton is a standard automaton extended with finite
collections of real-valued clocks and integer-valued data variables. We consider
automata where actions are taken from the infinite set $A_\tau$ and data variables
are integer variables in the global set $V$. The automaton model considered here
is a slight variation of the one supported by Uppaal. All results of this paper
do, however, generalize to the model of Uppaal.

Each automaton has a local set $C$ of clock variables. For any subset
$R \subseteq V$, we use $\mathcal{G}(C, R)$ to stand for the set of guards $g$ generated
as a conjunction of logical constraints $g_c$ over clock variables in $C$ and $g_v$
over data variables in $R$.‡ To manipulate clock and data variables we use reset
sets. We denote by $\mathcal{R}(C)$ the set of all resets of clocks. For any subsets
$W, R \subseteq V$, we denote by $\mathcal{R}(W, R)$ the set of all assignments $w := e$
where $w \in W$ and $e$ is an expression only depending on variables in $R$. We let
$\mathcal{R}(C, W, R) = \mathcal{R}(C) \cup \mathcal{R}(W, R)$.

Definition 2. A timed automaton $A$ is a tuple $(N, l_0, C, I, \Sigma, E)$, where $N$
is a finite set of locations, $l_0 \in N$ is the initial location, $C$ is a finite set
of real-valued clocks, $I$ is an invariant function assigning a predicate $I(l)$
over $C$ to each location $l$, $\Sigma = (R, W, IW)$ is a signature, and
$E \subseteq N \times \mathcal{G}(C, R) \times A_\tau \times 2^{\mathcal{R}(C, W, R)} \times N$
is a set of edges.

We write $l \xrightarrow{g, \mu, r}_A l'$, or simply $l \xrightarrow{g, \mu, r} l'$ when $A$
is clear from the context, to denote that $(l, g, \mu, r, l')$ is an edge of $A$.

A state of an automaton $A$ is a pair $((l, w), v)$ where $l$ is a node of $A$, $w$
is a clock assignment for $A$, and $v$ is a data variable assignment. The initial
state of $A$ is $((l_0, w_0), v_0)$, where $l_0$ is the initial node of $A$, $w_0$ is the
initial clock assignment mapping all clock variables to 0, and $v_0$ is the initial
data assignment that maps all variables in $V$ to 0.

† State $s$ could yield other interesting information such as program location etc.

‡ We leave the syntactic restrictions of $g_c$ and $g_v$ unspecified since they are
not important here.

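Definition 3. The operational semantics of a timed automaton $A$ is given by
the TTS $\mathcal{T}_A = (S, s_0, \Sigma, \rightarrow)$, where $S$ is the set of states of
$A$, $s_0$ is the initial state of $A$, $\Sigma = (R, W, IW)$ is the signature of $A$,
and $\rightarrow$ is the transition relation defined as follows:

- $((l, w), v) \xrightarrow{\mu} ((l', w'), v')$ iff
  $\exists r, g.\ l \xrightarrow{g, \mu, r} l' \wedge g_c(w) \wedge g_v(v) \wedge w' = r_c(w) \wedge v' = r_v(v)$

- $((l, w), v) \xrightarrow{\epsilon(X)} ((l', w'), v')$ iff
  $l' = l \wedge w' = w \wedge X \cap IW = \emptyset \wedge v'[X] = v[X]$

- $((l, w), v) \xrightarrow{\epsilon(d)} ((l', w'), v')$ iff
  $l' = l \wedge w' = w + d \wedge w' \models I(l') \wedge v' = v$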
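The action and delay rules of Definition 3 can be illustrated with a small sketch. It is not part of the paper: the state encoding (a tuple of location name, clock dictionary and data dictionary) and the function names `delay` and `fire` are assumptions made for illustration, guards and resets are passed as plain Python callables, and the environment rule is left out.

```python
def delay(state, d, invariant):
    """Delay rule: all clocks advance by d; location and data are unchanged.

    Returns the successor state, or None if d is negative or the invariant
    of the (unchanged) location does not hold for the advanced clocks.
    """
    loc, clocks, data = state
    if d < 0:
        return None
    advanced = {c: t + d for c, t in clocks.items()}
    if not invariant(loc, advanced):
        return None
    return (loc, advanced, dict(data))


def fire(state, edge):
    """Action rule: take an edge whose clock and data guards both hold.

    An edge is (source, guard_c, guard_v, reset_c, reset_v, target), where
    the guards are predicates and the resets map assignments to assignments.
    """
    loc, clocks, data = state
    source, guard_c, guard_v, reset_c, reset_v, target = edge
    if source != loc or not guard_c(clocks) or not guard_v(data):
        return None
    return (target, reset_c(clocks), reset_v(data))
```

A delay followed by an action is then two successive calls; a result of `None` means the corresponding transition does not exist in the sketched semantics.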

Uppaal supports verification of simple reachability properties of timed
automata, in particular whether certain locations and constraints on clock and
data variables are reachable from an initial state. Uppaal only allows for
reachability analysis of closed transition systems. A system $\mathcal{T}$ with
internally writable variables $IW$ is closed if all variables in $V$ are internal,
i.e. $IW = V$, and if the environment cannot synchronize with $\mathcal{T}$, i.e.
$\mathcal{T} = \mathcal{T} \setminus A$ where $\setminus$ is the standard action restriction
operator.

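For any states $s$ and $s'$ of a TTS, we write $s \stackrel{\epsilon(d)}{\Longrightarrow} s'$
iff there exists a finite transition sequence
$s = s_0 \xrightarrow{\alpha_1} s_1 \xrightarrow{\alpha_2} \cdots \xrightarrow{\alpha_n} s_n = s'$
such that for all $i \in \{1, \ldots, n\}$, $\alpha_i = \tau$ or $\alpha_i \in \mathcal{D}$, and
$d = \sum \{d_i \mid \alpha_i = \epsilon(d_i)\}$.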



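Definition 4. Let $\mathcal{T}$ be a TTS. We say that a state $s$ of $\mathcal{T}$ is
reachable in time $d$, written $\mathcal{T} \overset{\epsilon(d)}{\leadsto} s$, iff
$s_0 \stackrel{\epsilon(d)}{\Longrightarrow} s$, where $s_0$ is the initial state of
$\mathcal{T}$.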



For $A$ a timed automaton, we say that state $s$ is reachable in time $d$, written
$A \overset{\epsilon(d)}{\leadsto} s$, iff $\mathcal{T}_A \overset{\epsilon(d)}{\leadsto} s$.

We now want to define a condition between two timed automata that preserves
timed reachability from one system to the other. Let $\xrightarrow{\tau}^{*}$ denote
the reflexive and transitive closure of $\xrightarrow{\tau}$. For any states $s$ and
$s'$ of a TTS, we write $s \stackrel{\alpha}{\Longrightarrow} s'$ iff there exist
$s'', s'''$ such that
$s \xrightarrow{\tau}^{*} s'' \xrightarrow{\alpha} s''' \xrightarrow{\tau}^{*} s'$. Also,
let $\hat{\mu} = \epsilon(0)$ if $\mu = \tau$ and $\hat{\mu} = \mu$ otherwise.§

In the following let $\mathcal{T}_1$ and $\mathcal{T}_2$ be any two TTS's with signatures
$\Sigma_1 = (R_1, W_1, IW_1)$ and $\Sigma_2 = (R_2, W_2, IW_2)$, respectively. We let
$V_1 = R_1 \cup W_1$ and $V_2 = R_2 \cup W_2$. We write $sta(\mathcal{T}_1)$ and
$sta(\mathcal{T}_2)$ to denote the set of states of $\mathcal{T}_1$ and $\mathcal{T}_2$,
respectively. Let $s_0$ and $t_0$ denote the initial states of $\mathcal{T}_1$ and
$\mathcal{T}_2$, respectively.

§ For any state $s$, $s \xrightarrow{\epsilon(0)} s$ (zero-delay property).







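Definition 5. Let $R$ be a relation from $sta(\mathcal{T}_1)$ to $sta(\mathcal{T}_2)$. We
say that $R$ is a timed simulation from $\mathcal{T}_1$ to $\mathcal{T}_2$, written
$\mathcal{T}_1 \leq \mathcal{T}_2$ via $R$, provided $(s_0, t_0) \in R$ and for all
$(s, t) \in R$, $s[V_2] = t[V_2]$ and

- $s \xrightarrow{\mu} s' \Rightarrow \exists t'.\ t \stackrel{\hat{\mu}}{\Longrightarrow} t' \wedge (s', t') \in R$

- $s \xrightarrow{\epsilon(d)} s' \Rightarrow \exists t'.\ t \stackrel{\epsilon(d)}{\Longrightarrow} t' \wedge (s', t') \in R$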
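As a rough illustration of the shape of Definition 5, the following sketch computes the largest simulation between two finite labelled transition systems by fixpoint refinement. It is a simplification under stated assumptions, not the paper's procedure: the systems are finite and given explicitly, delays would be treated as ordinary labels, and each move is matched by a single step with the same label rather than by the weak transitions of the definition. The names `largest_simulation`, `trans1`, `trans2` and `agree` are hypothetical; `agree` stands in for the condition $s[V_2] = t[V_2]$.

```python
def largest_simulation(states1, states2, trans1, trans2, agree):
    """Largest R such that (s, t) in R implies agree(s, t) and every move
    s -a-> s2 is matched by some move t -a-> t2 with (s2, t2) in R.

    trans1 and trans2 map a state to an iterable of (label, successor) pairs.
    """
    rel = {(s, t) for s in states1 for t in states2 if agree(s, t)}
    changed = True
    while changed:
        changed = False
        for (s, t) in list(rel):
            for (a, s2) in trans1.get(s, ()):
                matched = any(b == a and (s2, t2) in rel
                              for (b, t2) in trans2.get(t, ()))
                if not matched:
                    # t cannot answer this move of s, so the pair is dropped
                    rel.discard((s, t))
                    changed = True
                    break
    return rel
```

Under these assumptions, the first system is simulated by the second exactly when the pair of initial states survives in the returned relation.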



Timed simulation is a sound condition for preservation of timed reachability 
in the following sense. 



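Theorem 1. If $\mathcal{T}_1 \leq \mathcal{T}_2$ and
$\mathcal{T}_1 \overset{\epsilon(d)}{\leadsto} s$, then there exists $t$ such that
$\mathcal{T}_2 \overset{\epsilon(d)}{\leadsto} t$ and $s[V_2] = t[V_2]$.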



Thus, any invariance property of $\mathcal{T}_1$ only referring to variables in $V_2$
may be immediately concluded provided the same invariance property holds in the
more abstract system $\mathcal{T}_2$.



3 Timed Ready Simulation 

In order to be of practical use, any verification methodology must be able to
support compositional reasoning. In our setting this means that we want timed
simulation to be a precongruence with respect to parallel composition of timed
automata. As we will see in this section, the timed simulation relation from
Definition 5 is not a precongruence. In this section we strengthen the notion of
timed simulation to a new kind of simulation called timed ready simulation and
we show that this relation is indeed a precongruence. Our compositionality
principle is, to the best of our knowledge, the first to allow compositional
verification of real-time systems with urgency and with shared
multi-reader/multi-writer variables. We begin this section by defining the notion
of parallel composition. We will say that two TTS's are compatible provided that
neither of them can write into variables that are declared as internally writable
by the other. In the following let $\mathcal{T}_1$ and $\mathcal{T}_2$ be TTS's with
signatures $\Sigma_1 = (R_1, W_1, IW_1)$ and $\Sigma_2 = (R_2, W_2, IW_2)$, respectively.
Also, let $V_1 = R_1 \cup W_1$ and $V_2 = R_2 \cup W_2$.

Definition 6. We say that $\Sigma_1$ and $\Sigma_2$ are compatible iff
$IW_1 \cap W_2 = IW_2 \cap W_1 = \emptyset$. Moreover, $\mathcal{T}_1$ and $\mathcal{T}_2$
are compatible iff $\Sigma_1$ and $\Sigma_2$ are compatible.

When composing two TTS’s we are allowed to “hide” some variables, i.e. to 
make them internally writable. We define a signature composition as follows. 

Definition 7. A signature $\Sigma = (R, W, IW)$ is said to be a composition of
$\Sigma_1$ and $\Sigma_2$ iff $R = R_1 \cup R_2$, $W = W_1 \cup W_2$, and
$IW \supseteq IW_1 \cup IW_2$.
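The compatibility and composition conditions on signatures are simple set checks, sketched below. The class `Signature` and the functions `compatible`, `is_composition` and `compose` are hypothetical names chosen for this illustration; `compose` builds one possible composition, with the optional `hide` argument naming the extra variables that are made internally writable.

```python
from typing import NamedTuple


class Signature(NamedTuple):
    R: frozenset   # readable variables
    W: frozenset   # writable variables
    IW: frozenset  # internally writable variables (a subset of W)


def compatible(s1, s2):
    """Neither side may write a variable the other declares internal."""
    return not (s1.IW & s2.W) and not (s2.IW & s1.W)


def is_composition(s, s1, s2):
    """Check the three conditions on a composed signature."""
    return (s.R == (s1.R | s2.R)
            and s.W == (s1.W | s2.W)
            and s.IW >= (s1.IW | s2.IW))


def compose(s1, s2, hide=frozenset()):
    """Build a composition, hiding the writable variables listed in hide."""
    if not compatible(s1, s2):
        raise ValueError("signatures are not compatible")
    writable = s1.W | s2.W
    if not hide <= writable:
        raise ValueError("only writable variables can be hidden")
    return Signature(s1.R | s2.R, writable, s1.IW | s2.IW | hide)
```

Keeping `IW` a superset rather than an equality is what lets a composition hide further variables.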

We can now define the notion of parallel composition. Let $s$ be any state of
a TTS. We can then consider $s$ as a pair $(p, v)$ where $v$ is the projection of $s$
onto variables of the global set $V$ and $p$ is the projection onto elements not in
$V$. For simplicity, we will use this presentation style in the following. Let $s_0$
and $t_0$ be the initial states of $\mathcal{T}_1$ and $\mathcal{T}_2$, respectively. We let
$p_{1,0}$ and $p_{2,0}$ denote the projections of $s_0$ and $t_0$, respectively, onto
elements not in $V$. Also, let $v_0$ denote the assignment mapping all elements of
$V$ to 0.







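Definition 8. Assume that $\mathcal{T}_1$ and $\mathcal{T}_2$ are compatible and let
$\Sigma = (R, W, IW)$ be a composition of $\Sigma_1$ and $\Sigma_2$. The parallel
composition $\mathcal{T}_1 \parallel \mathcal{T}_2$ with signature $\Sigma$ is the TTS
$(S, s_0, \Sigma, \rightarrow)$, where

- $S = \{((p_1, p_2), v) \mid (p_1, v) \in sta(\mathcal{T}_1) \wedge (p_2, v) \in sta(\mathcal{T}_2)\}$,

- $s_0 = ((p_{1,0}, p_{2,0}), v_0)$, and

- $\rightarrow$ is defined by the rules in Figure 1.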




Fig. 1. Rules defining the transition relation $\rightarrow$ in $\mathcal{T}_1 \parallel \mathcal{T}_2$



Assume that $A$ and $B$ are timed automata. We say that $A$ and $B$ are
compatible iff $\mathcal{T}_A$ and $\mathcal{T}_B$ are compatible, and we define
$A \parallel B$ as the TTS $\mathcal{T}_A \parallel \mathcal{T}_B$. We extend the above
notions inductively to hold for compositions of automata.

We now consider how the notion of timed simulation needs to be strengthened 
in order to be guaranteed a precongruence. First of all, the existing definition 
of timed simulation does not take into account the synchronization capabilities 
of the related systems. Second, the definition does not take into account the 
possible effects of an environment on the related systems. And third, urgent 
actions can have some effect on the delay properties of a composition that need 
to be taken into account. Figure 2 shows an example illustrating the above 
mentioned problems. 









Fig. 2. Example illustrating that $\leq$ is not a precongruence. Here $A \leq B$ but
$A \parallel C \not\leq B \parallel C$ due to the following problems: (1) $A$ and $B$ do
not react in the same way to effects of their environment $C$. When automaton $C$
sets data variable $i$ to 1 it enables an $a$ action in $A$ but not in $B$. Thus,
$A \parallel C$ can synchronize on $a$ and $\bar{a}$ and thereby set $i = 2$, whereas
$B \parallel C$ cannot synchronize. (2) Urgent action $u$ of $B$ can preempt delaying
in $B \parallel C$ but not in $A \parallel C$. In $A \parallel C$ a delay of more than 3
time units is possible, thereby enabling a transition setting $i = 3$. In
$B \parallel C$ urgent synchronization on $u$ and $\bar{u}$ preempts the possibility of
an initial delay and thus the possibility of setting $i = 3$.



We now strengthen the definition of timed simulation in order to remedy
the above-mentioned problems. We will say that $\mathcal{T}_2$ is a valid abstraction
for $\mathcal{T}_1$ provided $R_2 \subseteq R_1$, $W_2 \subseteq W_1$, and
$IW_2 \subseteq IW_1$. If $\mathcal{T}_2$ is a valid abstraction for $\mathcal{T}_1$ then
for any $X \subseteq V$ such that $X \cap IW_1 = \emptyset$, we also have
$X \cap IW_2 = \emptyset$. Thus, any environment “valid” for $\mathcal{T}_1$ is also
“valid” for $\mathcal{T}_2$.

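Definition 9. Assume that $\mathcal{T}_2$ is a valid abstraction for $\mathcal{T}_1$. Let
$R$ be a relation from $sta(\mathcal{T}_1)$ to $sta(\mathcal{T}_2)$. We say that $R$ is a
timed ready simulation from $\mathcal{T}_1$ to $\mathcal{T}_2$, written
$\mathcal{T}_1 \preceq \mathcal{T}_2$ via $R$, provided $(s_0, t_0) \in R$ and for all
$(s, t) \in R$, $s[V_2] = t[V_2]$ and

- $s \xrightarrow{\mu} s' \Rightarrow \exists t'.\ t \stackrel{\hat{\mu}}{\Longrightarrow} t' \wedge (s', t') \in R$

- $s \xrightarrow{\epsilon(d)} s' \Rightarrow \exists t'.\ t \stackrel{\epsilon(d)}{\Longrightarrow} t' \wedge (s', t') \in R$

- $s \xrightarrow{\epsilon(X)} s' \wedge t \xrightarrow{\epsilon(X)} t' \wedge s'[X] = t'[X] \Rightarrow (s', t') \in R$

- $t \xrightarrow{a} \wedge\ a \in A_u \Rightarrow s \xrightarrow{a}$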

We lift the notion of timed ready simulation to timed automata just as we did
for timed simulation. We now state the desired theorem, which says that $\preceq$
is indeed a precongruence and hence supports compositional reasoning.

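Theorem 2. Let $A \parallel C$ and $B \parallel D$ be timed automata compositions such
that $IW_{B \parallel D} \subseteq IW_{A \parallel C}$. Also, assume that $B$ and $D$ are both
$\tau$-free. If

1. $A \preceq B$ and $C \preceq D$, and

2. $V_A \cap V_D \subseteq V_B$ and $V_C \cap V_B \subseteq V_D$,

then $A \parallel C \preceq B \parallel D$.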

Condition 2 says that variables visible to $D$ (and hence $C$) cannot be removed
by the abstraction from $A$ to $B$. Analogously, variables visible to $B$ (and hence
$A$) cannot be removed by the abstraction from $C$ to $D$.







4 Testing for Timed Ready Simulation 

In this section we show that the problem of checking whether $A \preceq B$, for
timed automata $A$ and $B$, is equivalent to that of testing whether a special
reject node of the composition $A \parallel T_B$ is reachable, where $T_B$ is a special
construction called the test automaton for $B$. Hence, we can use Uppaal to
automatically check whether $A \preceq B$. We begin by introducing our general
notion of testing a timed automaton. This notion follows closely the one
presented in [1].

Definition 10. A test automaton is a tuple $T = (N, N_T, l_0, C, I, \Sigma, E)$ where
$N$, $l_0$, $C$, $I$, $\Sigma$, and $E$ are as in Definition 2, and $N_T \subseteq N$ is
the set of reject nodes.

Intuitively, a test automaton $T$ interacts with a tested system, represented by
a composition of timed automata, by communicating with it. The dynamics of
the interaction between the tester and the tested system is described by the
parallel composition of the automaton composition that is being tested and of
$T$. We now define failure and success of a test as follows. For $A$ an automaton
composition and $l$ a location of $A$, we write $A \leadsto l$ if there exists a state
$s$ of $A$ with location component $l$, and a delay $d$ such that
$A \overset{\epsilon(d)}{\leadsto} s$.

Definition 11. Let $A$ be an automaton composition and let $T$ be a test
automaton with set of reject nodes $N_T$. We say that $A$ fails the $T$-test iff
$A \parallel T \leadsto l$ for some $l \in N_T$. Otherwise, we say that $A$ passes the
$T$-test.
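Definition 11 reduces testing to a reachability question, which can be sketched as a breadth-first search. The sketch assumes an explicit, finite state graph standing in for the composition of the tested system and the tester; Uppaal itself explores a symbolic state space, which is not modelled here. The names `fails_test`, `successors` and `is_reject` are hypothetical.

```python
from collections import deque


def fails_test(initial, successors, is_reject):
    """Return True iff some reject state is reachable from the initial state.

    successors(state) yields the states reachable in one transition;
    is_reject(state) tells whether the state's location is a reject node.
    """
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if is_reject(state):
            return True
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

In this reading, a system passes the test exactly when `fails_test` returns `False`.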

The following theorem shows that it is possible to check for the existence of a
timed ready simulation between two automata compositions, using the notion of
testing introduced above. We will say that a timed automaton $B$ is deterministic
provided that for any states $s, s', s''$ of $\mathcal{T}_B$ and any label $\alpha$, if
$s \xrightarrow{\alpha} s'$ and $s \xrightarrow{\alpha} s''$ then $s' = s''$.

Theorem 3. Let $A$ be a timed automata composition, and let $B$ be a $\tau$-free and
deterministic timed automaton. Then there exists a test automaton $T_B$ such that
$A \preceq B$ iff $A$ passes the $T_B$-test.

In Figure 3 we have shown how the test automaton $T_B$ of Theorem 3 is
constructed node-wise by considering each node $l$ of $B$. For any data variable
$i \in V_B$ we assume that $\bar{i}$ is a fresh variable in $V_{T_B}$. Guards and reset
operations in $T_B$ are identical to the ones in $B$ except for substitution of
$\bar{i}$ for any data variable $i$. For simplicity we have shown the test
construction for the special case where all variables of $V_B$ are internally
writable. The construction easily generalizes to the case where some variables
are not internally writable.

Fig. 3. Automaton $B$ and its test automaton $T_B$. Actions $\{a_m, \ldots, a_k\}$ are
assumed to form the urgent subset of $\{a_1, \ldots, a_k\}$. The subgraph of $T_B$
induced by nodes $l$, $l_{u_1}, \ldots, l_{u_r}$, $l_u$ is used to test for the
backwards-match requirement of urgent actions in the $\preceq$ definition. If urgent
action $a_m$ of $B$ is enabled in $l$, $T_B$ enables the action $\overline{a_m}$ and
resets clock $x_u$. Now, if the automaton that $B$ is supposed to simulate does
not have an $a_m$ action immediately enabled, time can pass and thereby allow
$T_B$ to enter the reject node (sad face). The subgraph induced by the reject node
and node $l$ tests that the automaton that $B$ must simulate cannot delay beyond
the node invariant $I(l)$ of $l$. The transitions with guard
$\bigvee i \neq \bar{i}$ test that corresponding states in the simulation relation
agree on variables of $V_B$. The remaining part of $T_B$ tests for the
forwards-match of all actions. The $\tau$-transition from node $l$ to node T is
required to allow time to pass in node $l$ in the presence of urgent actions.

5 Conclusion

In this paper we have presented a tool-supported method for verifying properties
of real-time systems using abstraction and compositionality. In order to support
both abstraction and compositionality, we have identified a notion of timed ready
simulation between timed automata. We have shown that timed ready simulation
is a precongruence and hence does support compositional reasoning. To the
best of our knowledge, this is the first compositionality principle for timed
automata with both urgent channels and shared multi-reader/multi-writer
variables. Based on the work of Larsen [17,18] on context-dependent bisimulation,
we are currently extending our notion of timed ready simulation to a
context-dependent timed ready simulation, which is an extension of the timed
ready simulation parameterized with an assumption about environments. This
extension will allow for assume-guarantee style reasoning.

We have further provided a method for automatically testing for the existence 
of a timed ready simulation between timed automata using reachability analy- 
sis. The reachability analysis can be automatically performed by the real-time 
verification tool Uppaal. 

The results of this paper have been applied by us in the verification of a 
large industrial design - the Bang & Olufsen (B&O) audio/video power control- 
ler. This system is supposed to reside in an audio/video component and control 
links to neighbor audio/video components such as TV, VCR and remote-control. 
In particular, the system is responsible for the powering up and down of the 
components in between the arrival of data, and in order to do so, it is essential 
that no link interrupts are lost. In an earlier work [12] we successfully verified a 
scaled-down version of the full protocol. However, the size of the full protocol 
model is so large that Uppaal immediately encounters the state-explosion pro- 
blem in a direct verification (on a 1 GByte SUN computer). By application of 
our developed compositionality and testing results we are able to carry through 
a verification of the full protocol model within a few seconds. 

References 

1. Luca Aceto, Augusto Burgueno, and Kim G. Larsen. Model checking via reach- 
ability testing for timed automata. In Bernhard Steffen, editor, Proc. 4th Int. 
Conference on Tools and Algorithms for the Construction and Analysis of Systems 
(TACAS’98), volume 1384 of Lecture Notes in Computer Science, pages 263-280. 
Springer, 1998. 

2. R. Alur, C. Courcoubetis, and D. Dill. Model-checking for Real-Time Systems. 
In Proc. of Logic in Computer Science, pages 414-425. IEEE Computer Society 
Press, 1990. 

3. R. Alur and D. Dill. Automata for Modelling Real-Time Systems. In Proc. of 
ICALP’90, volume 443, 1990. 

4. R. Alur and D. Dill. A theory of timed automata. Theoretical Computer Science, 
126:183-236, 1994. 

5. R. Alur, T. A. Henzinger, F. Y. C. Mang, S. Qadeer, S. K. Rajamani, and
S. Tasiran. Mocha: Modularity in Model Checking. In Computer Aided Verification,
Proc. 10th Int. Conference, volume 1427 of Lecture Notes in Computer Science,
pages 521-525. Springer Verlag, 1998.

6. R. Alur, T.A. Henzinger, and P.-H. Ho. Automatic symbolic verification of em- 
bedded systems. IEEE Transactions on Software Engineering, pages 22:181-201, 
1996. 







7. Johan Bengtsson, David Griffioen, Kare Kristoffersen, Kim G. Larsen, Fredrik
Larsson, Paul Pettersson, and Wang Yi. Verification of an Audio Protocol with
Bus Collision Using Uppaal. In Proceedings of CAV’96, volume 1102 of Lecture
Notes in Computer Science. Springer Verlag, 1996.

8. D. Dams. Abstract Interpretation and Partition Refinement for Model Checking. 
PhD thesis, Eindhoven University of Technology, 1996. 

9. C. Daws, A. Olivero, S. Tripakis, and S. Yovine. The tool Kronos. In Hybrid
Systems III, Verification and Control, volume 1066 of Lecture Notes in Computer
Science. Springer Verlag, 1996.

10. C. Daws and S. Yovine. Two examples of verification of multirate timed automata 
with Kronos. In Proc. of the 16th IEEE Real-Time Systems Symposium, pages 
66-75, December 1995. 

11. Willem-Paul de Roever. The need for compositional proof systems: A survey. In
Willem-Paul de Roever, Hans Langmaack, and Amir Pnueli, editors,
Compositionality: The Significant Difference, International Symposium,
COMPOS’97, volume 1536 of Lecture Notes in Computer Science, pages 1-22.
Springer-Verlag, 1997.

12. K. Havelund, K. Larsen, and A. Skou. Formal Verification of a Power Controller
Using the Real-Time Model Checker Uppaal. In Joost-Pieter Katoen, editor,
Formal Methods for Real-Time and Probabilistic Systems, 5th International AMAST
Workshop, ARTS’99, volume 1601 of Lecture Notes in Computer Science, pages
277-298. Springer Verlag, 1999.

13. Pei-Hsin Ho and Howard Wong-Toi. Automated Analysis of an Audio Control 
Protocol. In Proc. of CAV’95, volume 939 of Lecture Notes in Computer Science. 
Springer Verlag, 1995. 

14. Henrik Ejersbo Jensen. Abstraction-Based Verification of Distributed Systems.
PhD thesis, Aalborg University, Institute for Computer Science, Aalborg,
Denmark, 1999.

15. Henrik Ejersbo Jensen, Kim G. Larsen, and Arne Skou. Modelling and Analysis of
a Collision Avoidance Protocol Using SPIN and UPPAAL. In J-C. Gregoire, G.J.
Holzmann, and D.A. Peled, editors, Proceedings Second Workshop on the SPIN
Verification System, American Mathematical Society, DIMACS/39, 1996.

16. Kare Jelling Kristoffersen. Compositional Verification of Concurrent Systems. PhD 
thesis, Aalborg University, Department of Computer Science, Institute for Electro- 
nic Systems, Aalborg, Denmark, August 1998. 

17. K.G. Larsen. Context-Dependent Bisimulation Between Processes. PhD thesis,
University of Edinburgh, Mayfield Road, Edinburgh, Scotland, 1986.

18. K.G. Larsen. A context dependent bisimulation between processes. Theoretical 
Computer Science, 49, 1987. 

19. Kim G. Larsen, Paul Pettersson, and Wang Yi. Uppaal in a Nutshell. Int. Journal 
on Software Tools for Technology Transfer, 1(1-2):134-152, October 1997. 

20. C. Loiseaux, S. Graf, J. Sifakis, A. Bouajjani, and S. Bensalem. Property
Preserving Abstractions for the Verification of Concurrent Systems. Formal
Methods in System Design, pages 6:11-44, 1995.

21. K. L. McMillan. Verification of an Implementation of Tomasulo’s Algorithm by
Compositional Model Checking. In Computer Aided Verification, Proc. 10th Int.
Conference, volume 1427 of Lecture Notes in Computer Science, pages 110-121.
Springer Verlag, 1998.




Decidable Model Checking of Probabilistic 
Hybrid Automata 



Jeremy Sproston* 

School of Computer Science, University of Birmingham, 
Birmingham B15 2TT, United Kingdom. J.Sproston@cs.bham.ac.uk 



Abstract. Hybrid automata offer a framework for the description of 
systems with both discrete and continuous components, such as digital 
technology embedded in an analogue environment. Traditional uses of 
hybrid automata express choice of transitions purely in terms of non- 
determinism, abstracting potentially significant information concerning 
the relative likelihood of certain behaviours. To model such probabilistic 
information, we present a variant of hybrid automata augmented with 
discrete probability distributions. We concentrate on restricted subclas- 
ses of the model in order to obtain decidable model checking algorithms 
for properties expressed in probabilistic temporal logics. 



1 Introduction 

Many systems, such as embedded controllers, can be modelled in terms of in- 
teraction between discrete and continuous components. Examples of such hybrid 
systems include robots, medical equipment and manufacturing processes. Tra- 
ditionally, formal techniques for the description of hybrid systems express the 
system model purely in terms of nondeterminism. However, it may be desira- 
ble to express the relative likelihood of the system exhibiting certain behaviour. 
This notion is particularly important when considering fault-tolerant systems, 
in which the occurrence of the discrete event malfunction is less likely than 
the event correct-action. Furthermore, it may be appropriate to model the
likelihood of an event changing with respect to the continuous behaviour of the
environment; for example, malfunction may become more likely if the system 
is operating at extreme temperatures or at high speeds. We may also wish to 
have a model checking algorithm for verifying automatically such hybrid systems 
against temporal logic properties referring explicitly to likelihoods. The feasibi- 
lity of such verification methods is suggested by the successful development of 
model checking algorithms and tools both in the domain of hybrid systems [11]
and that of discrete, finite-state probabilistic-nondeterministic systems [8]. 

Therefore, we extend the model of hybrid automata [1], a framework for the
description of hybrid systems, with discrete probability distributions. This
approach is inspired by the work of [16], which presents firstly a model of timed
automata (a highly restricted subclass of hybrid automata) extended with such
distributions, and secondly a decidable algorithm for verifying instances of this
model against formulae of a probabilistic temporal logic. Our new model,
probabilistic hybrid automata, differs from traditional hybrid automata in that the
edge relation of the graph representing the system’s discrete component is both
nondeterministic and probabilistic in nature. More precisely, instead of making
a purely nondeterministic choice over the set of currently enabled edges, we
nondeterministically choose amongst the set of enabled discrete probability
distributions, each of which is defined over a set of edges. Then, a probabilistic
choice as to which edge to take according to the selected distribution is
performed. Although probability is defined to affect directly only the discrete
dynamics of the model, the proposed model would nevertheless be useful for the
analysis of many systems, such as embedded technology operating according to
randomised algorithms, or the aforementioned fault-tolerant systems, for which an
appropriate probabilistic hybrid automaton may be obtained given appropriate
failure specifications of the system’s components.

* Supported in part by the EPSRC grant GR/N22960.

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 31-45, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000

A substantial body of work has been devoted to exploring notions of decida- 
bility of non-probabilistic hybrid automata, particularly with regard to problems 
which underlie model checking procedures. These problems are usually addressed
by utilising refinement relations such as simulation and bisimulation in order to 
introduce notions of equivalence and abstraction on the infinite state space of a 
hybrid automaton. It follows that model checking can then be performed not on 
the original, infinite state space, but on a quotient induced by an equivalence rela- 
tion; therefore, if the number of equivalence classes for a given hybrid automaton 
is finite, then model checking is decidable. However, such finitary quotients exist 
only for certain classes of model [12,18]. In particular, the rectangular automata 
of [12] feature differential inequalities which describe the continuous evolution 
of system variables taking place within piecewise-linear, convex envelopes, and 
can be used to state, for example, that a system variable increases between 1 
and 3 units per second. Another such class is that of o-minimal hybrid auto- 
mata [18], which permit expressive (albeit deterministic) continuous behaviour, 
and feature restricted discrete transitions. The remit of this paper is to extend 
these classes with discrete probability distributions, and to explore the way in 
which established model checking techniques for probabilistic-nondeterministic 
systems [7,6] may be used to verify such models against probabilistic temporal 
logic specifications. This provides us with a means to verify probabilistic ex- 
tensions of rectangular or o-minimal automata against properties such as ‘soft 
deadlines’ (for example, a response to a request will be granted within 5 seconds 
with probability at least 0.95), or those which refer to the probability of malfun- 
ction or component failure (such as, with probability 0.999 or greater, less than 
1 litre of coolant leaks from the nuclear reactor before an alarm is sounded). 

The paper proceeds by first introducing probabilistic hybrid automata in 
Section 2. Section 3 explains how their semantics can be presented in terms 
of infinite-state, nondeterministic-probabilistic transition systems. Strategies for 
model checking probabilistic rectangular automata and probabilistic o-minimal 
automata are presented in Section 4. 







2 Probabilistic Hybrid Automata 

The purpose of this section is to present a model for probabilistic hybrid
systems using the framework of hybrid automata, based on the probabilistic timed
automata of [16]. For a set $Y$, a (discrete probability) distribution on $Y$ is a
function $\mu : Y \to [0, 1]$ such that $\mu(y) > 0$ for at most countably many
$y \in Y$ and $\sum_{y \in Y} \mu(y) = 1$. We use $\mathrm{Dist}(Y)$ to denote the set of
all distributions on $Y$. Given a distribution $\mu$ on a set $Y$, let
$\mathrm{support}(\mu)$ be the support of $\mu$; that is, the set of elements $y$ of $Y$
such that $\mu(y) > 0$. If $Y$ contains one element, then a distribution over $Y$ is
called a Dirac distribution, and is denoted $\mathcal{D}(y)$ where $Y = \{y\}$. Next,
let $\mathcal{X} = \{x_1, \ldots, x_n\}$ be a set of real-valued variables. We write
$a \in \mathbb{R}^n$ for a vector of length $n$ which assigns a valuation
$a_i \in \mathbb{R}$ to each variable $x_i \in \mathcal{X}$. We fix a finite set $AP$ of
atomic propositions.
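The notions of distribution, support and Dirac distribution can be made concrete with a small sketch. It assumes distributions with finite support, represented as dictionaries from elements to probabilities, whereas the definition above also allows countably infinite support; the function names `is_distribution`, `support` and `dirac` are hypothetical.

```python
import math


def is_distribution(mu, tol=1e-9):
    """Check that all values lie in [0, 1] and that they sum to 1."""
    values = list(mu.values())
    return (all(0.0 <= p <= 1.0 for p in values)
            and math.isclose(sum(values), 1.0, abs_tol=tol))


def support(mu):
    """The elements that receive strictly positive probability."""
    return {y for y, p in mu.items() if p > 0}


def dirac(y):
    """The distribution that assigns probability 1 to the single element y."""
    return {y: 1.0}
```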

A probabilistic hybrid automaton
$H = (\mathcal{X}, V, L, init, inv, flow, prob, \langle pre_v \rangle_{v \in V})$
comprises the following components:

Variables. $\mathcal{X}$ is a finite set of real-valued variables.

Control modes. $V$ is a finite set of control modes.

Labelling function. The function $L : V \to 2^{AP}$ assigns a set of atomic
propositions to each control mode.

Initial set. The function $init : V \to 2^{\mathbb{R}^n}$ maps every control mode to an
initial set in $\mathbb{R}^n$.

Invariant set. The function $inv : V \to 2^{\mathbb{R}^n}$ maps every control mode to an
invariant set in $\mathbb{R}^n$.

Flow inclusion. The partial function $flow : V \times \mathbb{R}^n \to 2^{\mathbb{R}^n}$
maps control modes and valuations to a flow inclusion in $\mathbb{R}^n$, and is such
that, for each $v \in V$ and $a \in inv(v)$, the set $flow(v, a)$ is defined.

Probability distributions. The function
$prob : V \to \mathcal{P}_{fn}(\mathrm{Dist}(V \times 2^{\mathbb{R}^n} \times 2^{\mathcal{X}}))$
maps every control mode to a finite, non-empty set of distributions over the
set of control modes, and the powersets of both $\mathbb{R}^n$ and $\mathcal{X}$.
Therefore, each control mode $v \in V$ will be associated with a set of
distributions denoted by $prob(v) = \{\mu_v^1, \ldots, \mu_v^m\}$ for some finite
$m \geq 1$.

Pre-condition sets. For each $v \in V$, the function
$pre_v : prob(v) \to 2^{\mathbb{R}^n}$ maps every probability distribution associated
with a control mode to a pre-condition set in $\mathbb{R}^n$.

For simplicity, and without loss of generality, we assume that the initial point
is unique; that is, for a control mode $v_0 \in V$, the initial condition
$init(v_0)$ is a singleton in $\mathbb{R}^n$, and $init(v') = \emptyset$ for all other
$v' \in V \setminus \{v_0\}$. Therefore, control of the model commences in a mode $v_0$
with the variable valuation given by $init(v_0)$. When control of a probabilistic
rectangular automaton is in a given mode $v \in V$, the values of the real-valued
variables in $\mathcal{X}$ change continuously with respect to time. Such continuous
evolution is determined by the mode’s flow inclusion $flow$; that is, $flow(v, a)$
gives the set of values that the first derivative with respect to time
$\frac{dx_i}{dt}$ of each variable $x_i \in \mathcal{X}$ may take in the control mode
$v$ when the current value of the variables is given by $a$. A discrete transition,
henceforth called a control switch, from $v$ to another mode, may take place if the
pre-condition $pre_v(\mu)$ of a distribution $\mu \in prob(v)$ is satisfied by the
current values of the variables. In such a case, we say that $\mu$ is enabled.
Conversely, such a control switch must take place if the passage of some time
would result in the current variable values leaving the invariant set $inv(v)$.
For simplicity, the invariant and pre-condition sets are subject to the
assumption that, if allowing any amount of time to elapse would result in the
departure of the set $inv(v)$, then the current point in the continuous state space
must be in the pre-condition set of at least one distribution in $prob(v)$.

Fig. 1. The probabilistic hybrid automaton $H_1$.

Given that it has been decided to make a control switch via a particular
enabled distribution $\mu \in prob(v)$, then a probabilistic choice as to the target
mode of the switch, and to the discrete changes to the continuous variables, is
performed. More precisely, with probability $\mu(w, post, X)$, a transition is made
to mode $w \in V$ with the valuation $b$, such that $b$ is in the set
$post \subseteq \mathbb{R}^n$ and $b_i = a_i$ for every variable
$x_i \in \mathcal{X} \setminus X$, where $a$ is the valuation before the switch. Sets
such as $post$ are referred to as post-conditions, and variable sets such as $X$
are referred to as reset sets. Together, post-conditions and reset sets determine
the effects that a probabilistic hybrid automaton’s control switches have on its
continuous variables. These sets are subject to the following simplifying
assumption: for each $v \in V$, each $\mu \in prob(v)$, and each
$(w, post, X) \in \mathrm{support}(\mu)$, we have $post \subseteq inv(w)$.

An example of a probabilistic hybrid automaton is given in Figure 1. A 
number of standard conventions concerning the diagrammatic representation of 
hybrid automata are used here (see, for example, [1]). The probabilistic hybrid
automaton $H_1$ models a process in which data packets are repeatedly sent from
a sender to a receiver. We explain only the action of a fragment of the model. 
The process measures time according to a drifting clock, which is represented 
by the variable x. When the process is ready to send a packet (control is in 
the mode send), the clock progresses at any rate between ^ and ^ units per 
millisecond, and when it is idle (control is in the mode idle), the clock progresses 
at between | and | units per millisecond. In the former case, the process waits 
until its clock is equal to or greater than 4 before transmitting data, and must 
transmit before the clock exceeds 5. However, when transmission takes place, 
there is a 1% chance of the occurrence of an unrecoverable error (control passes 
to the mode error), as represented by a distribution over the edges from send to 
idle, and from send to error. More precisely, the distribution
$\mu \in prob(\mathrm{send})$ is such that
$\mu(\mathrm{idle}, [0,0], \{x\}) = 0.99$,
$\mu(\mathrm{error}, \mathbb{R}, \emptyset) = 0.01$ and
$pre_{\mathrm{send}}(\mu) = [4, \infty)$.
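The control switch out of the mode send can be sketched in code. The following is a minimal illustration, assuming a single clock variable x; the tuple layout and the names MU_SEND, PRE_SEND, enabled and successors are hypothetical and are introduced only for this sketch.

```python
from fractions import Fraction
import math

# Hypothetical encoding of the distribution mu in prob(send) of H_1.
# Each tuple is (target mode, post-condition interval, reset set, probability).
MU_SEND = [
    ("idle", (0.0, 0.0), frozenset({"x"}), Fraction(99, 100)),
    ("error", (-math.inf, math.inf), frozenset(), Fraction(1, 100)),
]
PRE_SEND = (4.0, math.inf)  # pre_send(mu) = [4, oo)


def enabled(x, pre=PRE_SEND):
    """mu is enabled when the clock value lies in its pre-condition set."""
    lo, hi = pre
    return lo <= x <= hi


def successors(x, mu=MU_SEND):
    """List (probability, mode, new clock value) for each tuple of mu.

    A reset variable takes the (singleton) post-condition value; a variable
    outside the reset set keeps its current value.
    """
    out = []
    for mode, (lo, hi), reset, p in mu:
        if "x" in reset:
            assert lo == hi, "sketch only handles singleton post-conditions"
            new_x = lo
        else:
            new_x = x
        out.append((p, mode, new_x))
    return out
```

With a clock value of 4.5, the switch leads to idle with the clock reset to 0 (probability 0.99) or to error with the clock unchanged (probability 0.01).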







Subclasses of probabilistic hybrid automata. A rectangular inequality over
$\mathcal{X}$ is of the form $x_i \sim k$, where $x_i \in \mathcal{X}$,
$\sim \in \{<, \leq, =, \geq, >\}$ and $k \in \mathbb{Q}$. A rectangular
predicate over $\mathcal{X}$ is a conjunction of rectangular inequalities over
$\mathcal{X}$. For any rectangular predicate $P$, the set of valuations for which
$P$ is true when each $x_i \in \mathcal{X}$ is replaced by its corresponding
valuation $a_i$ is denoted by $[\![P]\!]$ (intuitively, $[\![P]\!]$ is the set of
points in $\mathbb{R}^n$ that satisfy $P$). Furthermore, we call $[\![P]\!]$ a
rectangle, and occasionally refer to such a set as rectangular. A closed and
bounded rectangle is described as being compact. The set of rectangles over
$\mathcal{X}$ is obtained from the set of rectangular inequalities over
$\mathcal{X}$, and is denoted $Rect(\mathcal{X})$. The projection of the
rectangle $Z$ onto the axis of $x_i$ is denoted by $Z_i$.

We now introduce a probabilistic extension of rectangular automata [12]. A
probabilistic rectangular automaton $R$ is a probabilistic hybrid automaton such
that, for every $v \in V$, the sets $inv(v)$ and $flow(v, \cdot)$ are rectangles,
for every $\mu \in prob(v)$, the set $pre_v(\mu)$ is a rectangle, and, for every
$(w, post, X) \in support(\mu)$, the set $post$ is a rectangle. The probabilistic
rectangular automaton $R$ is initialised if, for every pair of modes
$v, w \in V$ and every $x_i \in \mathcal{X}$ for which
$flow(v, \cdot)_i \neq flow(w, \cdot)_i$, whenever there exist a distribution
$\mu \in prob(v)$ and a tuple $(w, post, X) \in support(\mu)$, we have
$x_i \in X$. Intuitively, if the execution of a control switch results in a
variable $x_i$ changing the condition on its continuous evolution, then the
value of $x_i$ must be reinitialised. The probabilistic rectangular automaton
$R$ has deterministic jumps if, for every $v, w \in V$, $\mu \in prob(v)$, and
$(w, post, X) \in support(\mu)$, the set $post_i$ is a singleton for every
$x_i \in X$. Intuitively, this requirement states that, for every control
switch, each variable either remains unchanged or is deterministically reset to
a new value. A probabilistic multisingular automaton $M$ is an initialised
probabilistic rectangular automaton with deterministic jumps such that, for each
$v \in V$ and for each $x_i \in \mathcal{X}$, we have $flow(v, \cdot)_i = k$ for
some $k \in \mathbb{N}$. ^
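The two structural conditions above can be checked mechanically on a finite description of an automaton. The following is a minimal sketch under assumed data structures: `flow[v]` is a tuple of per-variable flow intervals, and `prob[v]` is a list of distributions, each a list of tuples (w, post, reset, p) with `reset` a set of variable indices. These names and layouts are hypothetical.

```python
def is_initialised(flow, prob):
    """Every variable whose flow differs between v and w is reset on v -> w."""
    for v, distributions in prob.items():
        for mu in distributions:
            for w, _post, reset, _p in mu:
                for i, (fv, fw) in enumerate(zip(flow[v], flow[w])):
                    if fv != fw and i not in reset:
                        return False
    return True


def has_deterministic_jumps(prob):
    """Every reset variable has a singleton post-condition interval."""
    for distributions in prob.values():
        for mu in distributions:
            for _w, post, reset, _p in mu:
                if any(post[i][0] != post[i][1] for i in reset):
                    return False
    return True


# Hypothetical one-variable automaton with two modes.
FLOW = {"v": ((1, 2),), "w": ((3, 4),)}
GOOD = {"v": [[("w", ((0, 0),), {0}, 1.0)]], "w": []}
NOT_INITIALISED = {"v": [[("w", ((0, 5),), set(), 1.0)]], "w": []}
NOT_DETERMINISTIC = {"v": [[("w", ((0, 5),), {0}, 1.0)]], "w": []}
```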

Next, the o-minimal hybrid automata of [18] are extended to the probabilistic
context. The definition is as in Theorem 5.7 of [3], except for the following
alterations: naturally, we dispense with the notion of edges connecting control
modes, and replace them with a set of distributions; also, for every $v \in V$
and all $\mu \in prob(v)$, the sets $inv(v)$ and $pre_v(\mu)$ are semi-algebraic
with rational coefficients, and, for every $(w, post, X) \in support(\mu)$, the
set $post$ is semi-algebraic with rational coefficients and $X = \mathcal{X}$.
Finally, to obtain probabilistic o-minimal hybrid automata, we assume that the
initial point of the model is unique.



3 Semantics of Probabilistic Hybrid Automata 

3.1 Concurrent Probabilistic Systems 

The underlying transition system of a probabilistic hybrid automaton will take
the form of a concurrent probabilistic system [6]. These systems are based on
Markov decision processes, and are a state-labelled variant of the "simple
probabilistic automata" of [21]. Formally, a concurrent probabilistic system
$\mathcal{S}$ is a tuple $(Q, q^0, \mathcal{L}, \Sigma, Steps)$, where $Q$ is a
(possibly infinite) set of states, $q^0 \in Q$ is the initial state,
$\mathcal{L} : Q \to 2^{AP}$ is a function assigning a finite set of atomic
propositions to each state, $\Sigma$ is a set of events, and $Steps$ is a
function which assigns to each state $q \in Q$ a non-empty set
$Steps(q) \subseteq \Sigma \times Dist(Q)$ of pairs comprising an event $\sigma$
and a distribution $\nu$ on $Q$.

^ Observe that a probabilistic timed automaton [16] is a probabilistic
multisingular automaton such that $flow(v, \cdot)_i = 1$ for each $v \in V$ and
for each $x_i \in \mathcal{X}$.

A transition of $\mathcal{S}$ from state $q$ comprises a nondeterministic choice
of an event-distribution pair $(\sigma, \nu) \in Steps(q)$, followed by a
probabilistic choice of a next-state $q'$ according to $\nu$ such that
$\nu(q') > 0$, and is denoted by $q \xrightarrow{\sigma,\nu} q'$. A path of
$\mathcal{S}$ is a non-empty finite or infinite sequence of transitions of the
form $\omega = q_0 \xrightarrow{\sigma_0,\nu_0} q_1 \xrightarrow{\sigma_1,\nu_1} q_2 \cdots$.
The special case $\omega = q$, for some $q \in Q$, is also a path. The following
notation is employed when reasoning about paths. For a path $\omega$, the first
state of $\omega$ is denoted by $first(\omega)$, and, if $\omega$ is finite, the
last state of $\omega$ is denoted by $last(\omega)$. If $\omega$ is infinite,
then $step(\omega, i)$ is the event-distribution pair associated with the $i$-th
transition for each $i \in \mathbb{N}$. We denote by $Path_{ful}$ the set of
infinite paths, and by $Path_{ful}(q)$ the set of paths $\omega$ in $Path_{ful}$
such that $first(\omega) = q$.

An adversary of a concurrent probabilistic system $\mathcal{S}$ is a function $A$
mapping every finite path $\omega$ of $\mathcal{S}$ to an event-distribution
pair $(\sigma, \nu)$ such that $(\sigma, \nu) \in Steps(last(\omega))$.
Intuitively, an adversary resolves all of the nondeterministic choices of
$\mathcal{S}$. For an adversary $A$ of $\mathcal{S}$, we define $Path^A_{ful}$ to
be the set of paths in $Path_{ful}$ such that
$step(\omega, i) = A(\omega^{(i)})$ for all $i \in \mathbb{N}$, where
$\omega^{(i)}$ is the prefix of $\omega$ of length $i$. Furthermore,
$Path^A_{ful}(q)$ is defined to be the set of paths $\omega \in Path^A_{ful}$
such that $first(\omega) = q$. For each adversary, we can define a probability
measure $Prob^A$ on infinite paths in the standard manner (see, for example,
[6]).

We introduce two state relations for concurrent probabilistic systems, namely
probabilistic bisimulation and simulation. In the standard manner, the concept of
weight functions [15] is used to provide the basis of the definition of
simulation, and bisimulation is defined as a symmetric simulation [21]. Let
$\mathcal{R} \subseteq Q_1 \times Q_2$ be a relation between the two sets
$Q_1, Q_2$, and $\nu_1, \nu_2$ distributions such that $\nu_1 \in Dist(Q_1)$ and
$\nu_2 \in Dist(Q_2)$. A weight function for $(\nu_1, \nu_2)$ with respect to
$\mathcal{R}$ is a function $w : Q_1 \times Q_2 \to [0, 1]$ such that, for all
$q_1 \in Q_1$, $q_2 \in Q_2$:

1. if $w(q_1, q_2) > 0$, then $(q_1, q_2) \in \mathcal{R}$, and

2. $\sum_{q' \in Q_2} w(q_1, q') = \nu_1(q_1)$, and
   $\sum_{q' \in Q_1} w(q', q_2) = \nu_2(q_2)$.

We write $\nu_1 \mathcal{R} \nu_2$ if there exists a weight function for
$(\nu_1, \nu_2)$ with respect to $\mathcal{R}$.
For example, if Qi = {qi,q[}, Q2 = {g2,?2}> ^i(9i) = M(g'i) = ^^2(^2) = 5, 

^2((?2) = f) and TZ = {((71, (72), (< 7 i, 92)1 92 )}> then a weight function w for 

{h'1,1'2) with respect to TZ is w{qi,q2) = w{qi,q'2) = w{q[,q2) = 5. 
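The two conditions on weight functions can be checked directly for finite distributions. The sketch below uses exact rational arithmetic; the function name `is_weight_function` and the data at the bottom are hypothetical illustrations and are not the values of the example in the text.

```python
from fractions import Fraction as F


def is_weight_function(w, nu1, nu2, relation):
    """Check the weight-function conditions for (nu1, nu2) w.r.t. relation.

    w maps pairs (q1, q2) to weights in [0, 1]; missing pairs count as 0.
    """
    for (q1, q2), weight in w.items():
        if not 0 <= weight <= 1:
            return False
        # Condition 1: positive weight only on related pairs.
        if weight > 0 and (q1, q2) not in relation:
            return False
    # Condition 2: the marginals of w are nu1 and nu2.
    for q1, p in nu1.items():
        if sum(w.get((q1, q2), 0) for q2 in nu2) != p:
            return False
    for q2, p in nu2.items():
        if sum(w.get((q1, q2), 0) for q1 in nu1) != p:
            return False
    return True


# Hypothetical data, chosen only so that a weight function exists.
NU1 = {"q1": F(2, 3), "q1'": F(1, 3)}
NU2 = {"q2": F(2, 3), "q2'": F(1, 3)}
RELATION = {("q1", "q2"), ("q1", "q2'"), ("q1'", "q2")}
W = {("q1", "q2"): F(1, 3), ("q1", "q2'"): F(1, 3), ("q1'", "q2"): F(1, 3)}
```

Moving the weight of the pair (q1', q2) onto the unrelated pair (q1', q2') violates the first condition, so the check then fails.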

The following definitions, which follow immediately from the probabilistic
simulations and bisimulations of [15,21], are with respect to the concurrent
probabilistic system $\mathcal{S} = (Q, q^0, \mathcal{L}, \Sigma, Steps)$. We
write $q \xrightarrow{\sigma,\nu}$ if there exists a transition
$q \xrightarrow{\sigma,\nu} q'$ for some $q' \in Q$. A simulation of
$\mathcal{S}$ is a relation $\mathcal{R} \subseteq Q \times Q$ such that, for
each $(q_1, q_2) \in \mathcal{R}$:







1. $\mathcal{L}(q_1) = \mathcal{L}(q_2)$, and

2. if $q_1 \xrightarrow{\sigma,\nu_1}$, then $q_2 \xrightarrow{\sigma,\nu_2}$ for
   some distribution $\nu_2$ such that $\nu_1 \mathcal{R} \nu_2$.

We say that $q_2$ simulates $q_1$, denoted by $q_1 \preceq q_2$, iff there exists
a simulation which contains $(q_1, q_2)$. A bisimulation of $\mathcal{S}$ is a
simulation of $\mathcal{S}$ which is symmetric. Two states $q_1, q_2$ are called
bisimilar, denoted by $q_1 \simeq q_2$, iff there exists a bisimulation which
contains $(q_1, q_2)$. As any simulation is a preorder, a bisimulation is an
equivalence relation. We can define simulation and bisimulation with respect to
the composition of concurrent probabilistic systems in the standard manner
[19,5] in order to obtain a notion of relation between two such systems.

If an equivalence relation $\mathcal{R}$ on a (finite- or infinite-state)
concurrent probabilistic system $\mathcal{S}$ contains a finite number of
classes, we can define a finite-state quotient concurrent probabilistic system
$\mathcal{S}_{fin}$, the states of which are the equivalence classes of
$\mathcal{R}$, and the transitions of which are derived from those of
$\mathcal{S}$, such that the initial state of $\mathcal{S}$ is related to the
initial state of $\mathcal{S}_{fin}$ by $\mathcal{R}$. We omit details for
reasons of space.

3.2 Probabilistic Temporal Logic 

We now present a probabilistic temporal logic which can be used to specify
properties of probabilistic hybrid automata. In brief, PBTL (Probabilistic
Branching Time Logic) [6] is an extension of the temporal logic CTL in which the
until operator includes a bound on probability. For example, the property of
Section 1 regarding component failure is represented by the PBTL formula
$[(coolant)\ \forall\mathcal{U}\ (alarm)]_{\geq 0.999}$, where $coolant$ and
$alarm$ are atomic propositions labelling the appropriate states. Note that PBTL
is essentially identical to the logics PCTL and pCTL presented in [9] and [4,7]
respectively, and that PBTL model checking of finite-state concurrent
probabilistic systems may be performed using the algorithms of [7,6]. The syntax
of PBTL is defined as follows:

$\Phi ::= \mathtt{true} \mid a \mid \Phi \wedge \Phi \mid \neg\Phi \mid [\Phi\ \exists\mathcal{U}\ \Phi]_{\sqsupseteq\lambda} \mid [\Phi\ \forall\mathcal{U}\ \Phi]_{\sqsupseteq\lambda}$

where $a \in AP$, $\lambda \in [0,1]$, and $\sqsupseteq \in \{\geq, >\}$. The
satisfaction relation for $\mathtt{true}$, $a \in AP$, $\wedge$ and $\neg$ is
standard for temporal logic. In the following definition of the semantics of the
probabilistic operators $[\Phi_1\ \exists\mathcal{U}\ \Phi_2]_{\sqsupseteq\lambda}$
and $[\Phi_1\ \forall\mathcal{U}\ \Phi_2]_{\sqsupseteq\lambda}$ we make use of the
'path formula' $\Phi_1\ \mathcal{U}\ \Phi_2$, the interpretation of which is also
standard; that is, $\Phi_1\ \mathcal{U}\ \Phi_2$ is true of a path $\omega$ if
and only if $\Phi_2$ is true at some point along $\omega$, and $\Phi_1$ is
satisfied at all preceding points (for a formal description, see, e.g., [6]).
For a concurrent probabilistic system $\mathcal{S}$, a set $\mathcal{A}$ of
adversaries on $\mathcal{S}$, and a state $q$ of $\mathcal{S}$, the satisfaction
relation for $q \models_{\mathcal{A}} [\Phi_1\ \exists\mathcal{U}\ \Phi_2]_{\sqsupseteq\lambda}$
is as follows:

$q \models_{\mathcal{A}} [\Phi_1\ \exists\mathcal{U}\ \Phi_2]_{\sqsupseteq\lambda} \iff Prob^A(\{\omega \mid \omega \in Path^A_{ful}(q)\ \&\ \omega \models_{\mathcal{A}} \Phi_1\ \mathcal{U}\ \Phi_2\}) \sqsupseteq \lambda$
for some adversary $A \in \mathcal{A}$.

The semantics for $[\Phi_1\ \forall\mathcal{U}\ \Phi_2]_{\sqsupseteq\lambda}$ is
obtained by substituting "all adversaries $A \in \mathcal{A}$" for "some
adversary $A \in \mathcal{A}$" in the above equivalence. The concurrent
probabilistic system $\mathcal{S} = (Q, q^0, \mathcal{L}, \Sigma, Steps)$
satisfies the PBTL formula $\Phi$ iff $q^0 \models_{\mathcal{A}} \Phi$.
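For a finite-state system, the probability bound in an until formula can be approximated numerically. The sketch below uses plain value iteration, which is a substituted technique for illustration and is not claimed to be the algorithm of [7,6]; it also ignores restrictions on the adversary set such as fairness. All names are hypothetical, and because the result is a floating-point approximation, comparisons against a bound that is met exactly should be treated with care.

```python
def until_probabilities(states, steps, phi1, phi2, maximise=True, sweeps=1000):
    """Approximate the optimal probability of `phi1 U phi2` from every state.

    steps[q] is a non-empty list of distributions, each a dict
    {next_state: probability}; events are omitted because they do not affect
    the probability. phi1 and phi2 are the sets of states satisfying the two
    subformulae. maximise=True approximates the supremum over adversaries
    (existential until), maximise=False the infimum (universal until).
    """
    pick = max if maximise else min
    x = {q: (1.0 if q in phi2 else 0.0) for q in states}
    for _ in range(sweeps):
        new = {}
        for q in states:
            if q in phi2:
                new[q] = 1.0
            elif q not in phi1:
                new[q] = 0.0
            else:
                new[q] = pick(
                    sum(p * x[t] for t, p in nu.items()) for nu in steps[q]
                )
        x = new
    return x


# Hypothetical three-state system: from s0 either try once, or retry forever.
STATES = ["s0", "ok", "bad"]
STEPS = {
    "s0": [{"ok": 0.9, "bad": 0.1}, {"ok": 0.5, "s0": 0.5}],
    "ok": [{"ok": 1.0}],
    "bad": [{"bad": 1.0}],
}
```

In this hypothetical system the best adversary reaches `ok` from `s0` with probability approaching 1, and the worst with probability 0.9, so a universal until formula with bound 0.9 would hold at `s0` while one with bound 0.95 would not.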







We now introduce $\forall$PBTL as a fragment of PBTL involving only universal
quantification over adversaries. The syntax of $\forall$PBTL is defined as
follows:

$\Phi ::= \mathtt{true} \mid \mathtt{false} \mid a \mid \neg a \mid \Phi \wedge \Phi \mid \Phi \vee \Phi \mid [\Phi\ \forall\mathcal{U}\ \Phi]_{\sqsupseteq\lambda}$

where $a$, $\lambda$ and $\sqsupseteq$ are as in the definition of PBTL. The
following theorem states that simulation and bisimulation preserve certain
($\forall$)PBTL formulae, and is inspired by a conjecture in [5]. The proof
follows from similar results of Segala [21], which are defined for an
action-based, rather than a state-based logic such as PBTL, and of Aziz et al.
[4], which concern fully probabilistic systems (that is, Markov chains, or,
equivalently, concurrent probabilistic systems for which $|Steps(q)| = 1$ for
all states $q \in Q$).

Theorem 1. Let $\mathcal{S}$ be a concurrent probabilistic system, $\mathcal{A}$
a set of adversaries of $\mathcal{S}$, let $\Phi_\forall$, $\Phi$ be formulae of
$\forall$PBTL and PBTL respectively, and let $q_1, q_2 \in Q$.

- If $q_1 \preceq q_2$, then $q_2 \models_{\mathcal{A}} \Phi_\forall$ implies
  $q_1 \models_{\mathcal{A}} \Phi_\forall$.

- If $q_1 \simeq q_2$, then $q_1 \models_{\mathcal{A}} \Phi$ iff
  $q_2 \models_{\mathcal{A}} \Phi$.

Naturally, for the two concurrent probabilistic systems
$\mathcal{S}_1 = (Q_1, q_1^0, \mathcal{L}_1, \Sigma_1, Steps_1)$ and
$\mathcal{S}_2 = (Q_2, q_2^0, \mathcal{L}_2, \Sigma_2, Steps_2)$, if
$q_1^0 \preceq q_2^0$, then $\mathcal{S}_2 \models_{\mathcal{A}} \Phi_\forall$
implies $\mathcal{S}_1 \models_{\mathcal{A}} \Phi_\forall$. Similarly, if
$q_1^0 \simeq q_2^0$, then $\mathcal{S}_1 \models_{\mathcal{A}} \Phi$ if and only
if $\mathcal{S}_2 \models_{\mathcal{A}} \Phi$. Observe that Theorem 1 implies a
decidable PBTL model checking procedure for any infinite-state concurrent
probabilistic system with a finitary bisimilarity relation via reduction to the
quotient concurrent probabilistic system, which can then be verified using the
techniques of [7,6].

3.3 Semantics of Probabilistic Hybrid Automata 

The semantics of a given probabilistic hybrid automaton $H$ can be represented in
terms of a concurrent probabilistic system in the following way. The subsequent
notation is used to reason about the target states of the probabilistic
transitions of $H$. Let $\mathbf{a} \in \mathbb{R}^n$, let
$Z \in Rect(\mathcal{X})$ be a rectangle and let $X \subseteq \mathcal{X}$. Then
$\mathbf{a}[X := Z]$ denotes the set of valuations such that
$\mathbf{a}' \in \mathbf{a}[X := Z]$ iff $\mathbf{a}' \in Z$ and $a'_i = a_i$ for
all $x_i \in \mathcal{X} \setminus X$. Now, consider the valuation
$\mathbf{a} \in \mathbb{R}^n$ and the $m$-vector
$\langle\eta\rangle = [(Z^1, X^1), \ldots, (Z^m, X^m)]$, where, for each
$j \in \{1, \ldots, m\}$, the set $Z^j$ is a rectangle and
$X^j \subseteq \mathcal{X}$ is a variable set. Then we generate the $m$-vector of
valuations $\langle\mathbf{b}\rangle = [\mathbf{b}^1, \ldots, \mathbf{b}^m]$ in
the following way: for each $j \in \{1, \ldots, m\}$, we choose a valuation
$\mathbf{b}^j \in \mathbb{R}^n$ such that
$\mathbf{b}^j \in \mathbf{a}[X^j := Z^j]$. Observe that, for any
$i, j \in \{1, \ldots, m\}$ such that $i \neq j$, it may be the case that
$\mathbf{a}[X^i := Z^i]$ and $\mathbf{a}[X^j := Z^j]$ have a non-empty
intersection, and therefore it is possible that $\mathbf{b}^i = \mathbf{b}^j$.
Let $Combinations(\mathbf{a}, \langle\eta\rangle)$ be the set of all such vectors
$\langle\mathbf{b}\rangle$. In the sequel, we use exclusively vectors of the form
of $\langle\eta\rangle$ which comprise post-conditions and variable sets in the
support of a distribution. For the distribution $\mu$, we let the vector
$extract(\mu) = [(post^1, X^1), \ldots, (post^m, X^m)]$ if
$support(\mu) = \{(w^1, post^1, X^1), \ldots, (w^m, post^m, X^m)\}$.
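Membership in the reset set and in the set of combinations can be tested directly. The following is a minimal sketch, assuming that rectangles are products of closed intervals and that variables are identified by their index; the function names are hypothetical.

```python
def in_rectangle(a, Z):
    """Z is a list of closed intervals (lo, hi), one per variable."""
    return all(lo <= ai <= hi for ai, (lo, hi) in zip(a, Z))


def in_reset_set(a_new, a, X, Z):
    """Test a_new in a[X := Z]: a_new lies in Z and agrees with a outside X."""
    unchanged = all(a_new[i] == a[i] for i in range(len(a)) if i not in X)
    return unchanged and in_rectangle(a_new, Z)


def is_combination(bs, a, eta):
    """Test whether bs = [b^1, ..., b^m] belongs to Combinations(a, eta),
    where eta = [(Z^1, X^1), ..., (Z^m, X^m)]."""
    return len(bs) == len(eta) and all(
        in_reset_set(b, a, X, Z) for b, (Z, X) in zip(bs, eta)
    )
```

For example, with a = (1.0, 2.0), a first pair that resets variable 0 to the point 0 and a second pair that resets nothing, the vector [(0.0, 2.0), (1.0, 2.0)] is a combination, whereas any vector that changes variable 1 is not.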

The (time-abstract) concurrent probabilistic system
$\mathcal{S}_H = (Q_H, q_H^0, \mathcal{L}_H, \Sigma_H, Steps_H)$ of the
probabilistic hybrid automaton
$H = (\mathcal{X}, V, L, init, inv, flow, prob, \langle pre_v \rangle_{v \in V})$
is defined as follows:







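- $Q_H \subseteq V \times \mathbb{R}^n$ such that $(v, \mathbf{a}) \in Q_H$ iff
  $\mathbf{a} \in inv(v)$;

- $q_H^0 \in Q_H$ such that $q_H^0 = (v, init(v))$ for $v \in V$ such that
  $init(v) \neq \emptyset$;

- for each $(v, \mathbf{a}) \in Q_H$, we have
  $\mathcal{L}_H(v, \mathbf{a}) = L(v)$;

- $\Sigma_H = \{\theta, \tau\}$;

- for each $(v, \mathbf{a}) \in Q_H$, we have
  $Steps_H(v, \mathbf{a}) = Cts_H(v, \mathbf{a}) \cup Disc_H(v, \mathbf{a})$,
  where:

  • for each $\delta \in \mathbb{R}_{\geq 0}$, there exists the pair
    $(\tau, \mathcal{D}(v, \mathbf{b})) \in Cts_H(v, \mathbf{a})$ iff
    $\mathbf{b} \in inv(v)$, and there exists a differentiable function
    $f : [0, \delta] \to \mathbb{R}^n$ with
    $\dot{f} : (0, \delta) \to \mathbb{R}^n$ such that $f(0) = \mathbf{a}$,
    $f(\delta) = \mathbf{b}$, and $\dot{f}(\epsilon) \in flow(v, f(\epsilon))$
    and $f(\epsilon) \in inv(v)$ for all $\epsilon \in (0, \delta)$;

  • for each $\mu \in prob(v)$, if $\mathbf{a} \in pre_v(\mu)$, then there exists
    the pair $(\theta, \nu_{\langle\mathbf{b}\rangle}) \in Disc_H(v, \mathbf{a})$,
    for each
    $\langle\mathbf{b}\rangle \in Combinations(\mathbf{a}, extract(\mu))$, where:

    $\nu_{\langle\mathbf{b}\rangle}(w, \mathbf{c}) = \sum_{i \in \{1, \ldots, |support(\mu)|\}\ \&\ \mathbf{c} = \mathbf{b}^i} \mu(w, post^i, X^i).$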
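The sum in the last item merges tuples of a distribution that happen to lead to the same target state. A minimal sketch of this step follows, assuming each distribution is given as a list of tuples (w, post, X, probability) and that the vector of chosen valuations is already available; the name `discrete_distribution` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def discrete_distribution(mu, bs):
    """Build the distribution over target states from mu and a vector bs.

    mu is a list of (w, post, X, probability) tuples and bs[i] is the
    valuation chosen for the i-th tuple. Tuples that lead to the same target
    state (w, c) have their probabilities added together.
    """
    nu = defaultdict(Fraction)
    for (w, _post, _X, p), b in zip(mu, bs):
        nu[(w, tuple(b))] += p
    return dict(nu)
```

For instance, if three tuples with probabilities 1/4, 1/4 and 1/2 all target the same mode, and the first two are resolved to the same valuation, the resulting distribution assigns 1/2 to each of the two distinct target states.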

For a state $(v, \mathbf{a}) \in Q_H$, the definition of the continuous
transitions in $Cts_H(v, \mathbf{a})$ is identical to the analogous definition
for non-probabilistic hybrid automata, except that we require them to be made
according to Dirac distributions (that is, with probability 1). The definition
of the transitions in $Disc_H(v, \mathbf{a})$ reflects the intuition that $H$
performs a control switch in the following manner: by (a) choosing an enabled
distribution nondeterministically; (b) selecting a target mode and
post-condition set probabilistically; and (c) choosing a successor state within
the post-condition set nondeterministically. It is easy to verify that combining
the two nondeterministic choices that comprise the first and third steps of the
transition into a single nondeterministic selection, in the manner of the
definition of $\mathcal{S}_H$, results in an equivalent transition. Naturally, if
the post-condition set of at least one tuple in the support of a distribution
$\mu$ is uncountable, then the set of vectors of the form
$\langle\mathbf{b}\rangle$ associated with this set will also be uncountable, as
will the set of transitions in $Disc_H(\cdot)$ corresponding to $\mu$. As a
further note, observe that the definitions of (bi)simulation are applicable to
concurrent probabilistic systems of probabilistic hybrid automata. Finally, a
notion of time divergence can be associated with adversaries of the concurrent
probabilistic systems of probabilistic hybrid automata; we omit details for
reasons of space.

4 Model Checking Subclasses of Probabilistic Hybrid 
Automata 

4.1 Probabilistic Multisingular and O-Minimal Hybrid Automata 

The results of [1] and [18], which state the existence of finite bisimulation
quotients of non-probabilistic multisingular and o-minimal hybrid automata
respectively, can be extended to the probabilistic context in the following way.
Firstly, the region equivalence of [2,1] can be used to subdivide the infinitary
state space of a probabilistic multisingular automaton $M$ into a finite number
of equivalence classes. Without loss of generality (see [3]), let all endpoints
of rectangles used in the description of $M$ be non-negative integers, with the
maximal such integer denoted by $c$. For any $t \in \mathbb{R}$, let
$\lfloor t \rfloor$ denote its integral part and $frac(t)$ its fractional part.
For a vector $\mathbf{a}$, let $\lfloor\mathbf{a}\rfloor$ denote the vector whose
$i$th coordinate is $\lfloor a_i \rfloor$, and $frac(\mathbf{a})$ the vector
whose $i$th coordinate is $frac(a_i)$.
For each mode v £ V, let = [Ci, ■•■,Cn] be the n-vector such that the ith 
element of {C"") is flow{v)i if flow{v)i yf 0, and is 1 otherwise. Let be the 
equivalence relation on M” such that a b iff, for each Xi,Xj £ X, (1) 
LCa*J = LCb*J, (2) frac{Qa.i) = frac{Qhi), and (3) frac{Qa.i) = fraciQaj) 
iff frac{Qhi) = frac{Qhj). Two states (v,a) and (w,b) are region equivalent, 
written (u,a) (rc,b), if (1) v = w, (2) for each Xi £ X, either [a^J = [b^J, 

or both a^ > c and b^ > c, and (3) frac{a) frac{h) (our notation is adapted 
from [10]). Intuitively, for each control mode $v \in V$, region equivalence
subdivides $\mathbb{R}^n$ into a finite grid of unit hypercubes, which are in
turn subdivided according to the flow gradients $flow(v, \cdot)_i = k_i$ of each
$x_i \in \mathcal{X}$.
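To give a feel for the grid, the following sketch tests region equivalence in the simplest special case, in which every flow equals 1 (the probabilistic timed automata mentioned in the footnote above). It implements the standard region equivalence of timed automata [2], not the flow-scaled variant defined in this section; the function name is hypothetical, valuations are assumed non-negative, and floating-point inputs are assumed to be exactly representable where it matters.

```python
import math


def region_equivalent(a, b, c):
    """Standard timed-automata region equivalence of valuations a and b.

    c is the maximal constant. Assumption: every variable has flow 1.
    """
    n = len(a)
    for i in range(n):
        if a[i] > c and b[i] > c:
            continue  # both beyond the maximal constant: indistinguishable
        if a[i] > c or b[i] > c:
            return False
        if math.floor(a[i]) != math.floor(b[i]):
            return False
        if (a[i] % 1 == 0) != (b[i] % 1 == 0):
            return False
    relevant = [i for i in range(n) if a[i] <= c]
    for i in relevant:
        for j in relevant:
            # The ordering of fractional parts must agree.
            if (a[i] % 1 <= a[j] % 1) != (b[i] % 1 <= b[j] % 1):
                return False
    return True
```

Two valuations in the same unit square are equivalent only when their fractional parts are ordered in the same way, which corresponds to lying on the same side of the diagonal.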

Lemma 1. Let $M$ be a probabilistic multisingular automaton. Region equivalence
is a finite bisimulation of $\mathcal{S}_M$.

Proof. Clearly region equivalence has a finite number of equivalence classes;
therefore, it remains to show that it is a bisimulation. The case for the
continuous transitions in the sets $Cts_M(\cdot)$ is similar to that in the
non-probabilistic context, and therefore we concentrate on the discrete
transitions in $Disc_M(\cdot)$.

Observe that the set of valuations in a given region equivalence class is either
contained within any rectangle $Z$ used in the description of $M$, or is
disjoint from $Z$. In particular, all valuations within such a class must be in
the same pre-condition sets of $M$, and therefore enable the same distributions
for choice. That is, if two states
$(v, \mathbf{a}), (v, \mathbf{b}) \in Q_M$ are region equivalent, then, for any
$\mu \in prob(v)$, we have $\mathbf{a} \in pre_v(\mu)$ if and only if
$\mathbf{b} \in pre_v(\mu)$. Therefore, there exists an event-distribution pair
$(\theta, \nu_{\mathbf{a}}) \in Steps_M(v, \mathbf{a})$ if and only if there
exists $(\theta, \nu_{\mathbf{b}}) \in Steps_M(v, \mathbf{b})$, such that both
$\nu_{\mathbf{a}}$ and $\nu_{\mathbf{b}}$ are derived from $\mu$. Now we show
that $\nu_{\mathbf{a}}$ and $\nu_{\mathbf{b}}$ are related by region
equivalence. A standard fact is that, given region equivalent states
$(v, \mathbf{a})$ and $(v, \mathbf{b})$, for any tuple
$(w, post, X) \in V \times 2^{\mathbb{R}^n} \times 2^{\mathcal{X}}$ such that
$post$ is a singleton, the states $(w, \mathbf{a}[X := post])$ and
$(w, \mathbf{b}[X := post])$ are region equivalent. Furthermore, if the tuples
$(w, post, X), (w, post', X') \in V \times 2^{\mathbb{R}^n} \times 2^{\mathcal{X}}$
are such that $(w, \mathbf{a}[X := post]) = (w, \mathbf{a}[X' := post'])$, then
it must be the case that
$(w, \mathbf{b}[X := post]) = (w, \mathbf{b}[X' := post'])$. The combination of
these facts then gives us that, for $(w, \mathbf{c}), (w, \mathbf{d}) \in Q_M$
which are region equivalent:

$\nu_{\mathbf{a}}(w, \mathbf{c}) = \sum_{i \in \{1, \ldots, k\}\ \&\ \mathbf{c} = \mathbf{a}[X^i := post^i]} \mu(w, post^i, X^i) = \sum_{i \in \{1, \ldots, k\}\ \&\ \mathbf{d} = \mathbf{b}[X^i := post^i]} \mu(w, post^i, X^i) = \nu_{\mathbf{b}}(w, \mathbf{d}),$

where $k = |support(\mu)|$. The fact that $\nu_{\mathbf{a}}$ and
$\nu_{\mathbf{b}}$ are related then follows. We can repeat this process for all
region equivalence classes, and all distributions enabled in these classes, to
conclude that region equivalence satisfies the properties of bisimulation. □

Secondly, we show that the model checking results for o-minimal hybrid automata
of [18,3] transfer to the probabilistic context. Observe that the previous
decidability results for this class necessitate the decoupling of discrete and
continuous behaviour of the hybrid automata; more precisely, all variables are
reset to a new value at every discrete transition. The result of [18] then shows
that, for each control mode $v \in V$, the associated continuous state space of
$v$ has a finite bisimulation quotient. This quotient is obtained after an
initial subdivision of the continuous space of $v$ according to the invariant
set of $v$, the pre-condition sets of all the outgoing discrete transitions of
$v$, and the post-condition sets of all incoming discrete transitions of $v$.
Similarly, in our context, we require that such an initial subdivision is made
according to $pre_v(\mu)$, for all $\mu \in prob(v)$, in addition to $inv(v)$
and all post-condition sets $post$ appearing in tuples of the form
$(v, post, X) \in support(\mu')$, for all $\mu' \in prob(v')$ and $v' \in V$.

Consider the probabilistic o-minimal hybrid automaton $O$, and let
$(v, \mathbf{a}), (v, \mathbf{b}) \in Q_O$ be two states such that
$\mathbf{a}, \mathbf{b} \in pre_v(\mu)$ for some $\mu \in prob(v)$. Because the
reset set for all tuples $(w, post, X) \in support(\mu)$ is the full variable
set $\mathcal{X}$, it follows that
$Combinations(\mathbf{a}, extract(\mu)) = Combinations(\mathbf{b}, extract(\mu))$.
Intuitively, given that the distribution $\mu$ is enabled in $(v, \mathbf{a})$
and $(v, \mathbf{b})$, the distinction between the valuations $\mathbf{a}$ and
$\mathbf{b}$ is lost after taking $\mu$. Therefore, the sets of concurrent
probabilistic system distributions corresponding to the choice of $\mu$ are the
same for $(v, \mathbf{a})$ and $(v, \mathbf{b})$. Now consider the case in which
$(v, \mathbf{a})$ and $(v, \mathbf{b})$ lie in the intersection of the
pre-condition sets of multiple distributions
$\mu_1, \ldots, \mu_l \in prob(v)$, where $l \in \{1, \ldots, |prob(v)|\}$.
Then, extending our intuition from the single distribution $\mu$ to the set
$\{\mu_1, \ldots, \mu_l\}$, we have the strong characteristic that
$Disc_O(v, \mathbf{a}) = Disc_O(v, \mathbf{b})$. Such intersections of
pre-conditions are further subdivided with respect to continuous transitions
using the methodology of [18]. Given that $(v, \mathbf{a})$ and
$(v, \mathbf{b})$ lie in the same portion of the state space according to this
subdivision, we conclude that $(v, \mathbf{a})$ and $(v, \mathbf{b})$ are
bisimilar.

Lemma 2. Let $O$ be a probabilistic o-minimal hybrid automaton. $O$ has a finite
bisimulation quotient.

Corollary 1. The PBTL model checking problems for probabilistic multisingular
automata and probabilistic o-minimal hybrid automata are decidable.

4.2 Probabilistic Rectangular Automata 

We now introduce a model checking strategy for initialised probabilistic
rectangular automata, based on similar results in the non-probabilistic context
of [20,12]. From an initialised probabilistic rectangular automaton $R$, we
construct a probabilistic multisingular automaton $M_R$, such that $M_R$ is a
sufficient abstraction of $R$ which can subsequently be verified. More
precisely, each variable $x_i \in \mathcal{X}$ of $R$ is represented by two
variables $y_{l(i)}, y_{u(i)} \in \mathcal{Y}$ of $M_R$, with the intuition that
$y_{l(i)}$ tracks the least possible value of $x_i$, whereas $y_{u(i)}$ tracks
its greatest possible value. Therefore, singular flow conditions for $y_{l(i)}$
(respectively, $y_{u(i)}$) are derived from the minimal (respectively, maximal)
slopes that $x_i$ may take in $R$. Furthermore, the probabilistic edge relation
of $M_R$ updates $y_{l(i)}$ and $y_{u(i)}$ so that the interval
$[y_{l(i)}, y_{u(i)}]$ represents the possible values of $x_i$. For example,
consider Figure 2(a); say the current control mode is $v$, and that the flow
condition $flow(v)_i$ of the variable $x_i$ is the rectangle $[k_l, k_u]$. As
time passes, the possible values of $x_i$ are contained within an envelope, the
lower (respectively, upper) bound of which is represented by $y_{l(i)}$
(respectively, $y_{u(i)}$). If a control switch occurs, the values of $y_{l(i)}$
and $y_{u(i)}$ must continue to represent the possible values of $x_i$; this may
involve resetting $y_{l(i)}$ or $y_{u(i)}$, or both, even if $x_i$ is not reset
by the corresponding control switch. In Figure 2(a), at some time a distribution
$\mu$ is chosen, where $pre_v(\mu) = [c, \infty)$, and say a tuple
$(w, post, X) \in support(\mu)$ is probabilistically selected for which
$x_i \notin X$. Then, for $y_{l(i)}$ to correctly represent the lower bound on
$x_i$, it must be updated to $c$ when emulating this control switch, as its
value is below $c$ when the distribution $\mu$ was selected. This reflects the
standard intuition in the non-probabilistic case of [12].

Fig. 2. (a) Updating the value of $x_i$. (b) $M_R$ simulates $R$.

Let $R = (\mathcal{X}, V, L, init^R, inv^R, flow^R, prob^R, \langle pre_v^R \rangle_{v \in V})$
be an initialised probabilistic rectangular automaton subject to the following
simplifying assumptions. For all control modes $v \in V$, we have
$inv^R(v) = \mathbb{R}^n$, and the rectangle $flow^R(v, \cdot)$ is compact; for
all $\mu \in prob^R(v)$, the rectangle $pre_v^R(\mu)$ is compact, and for each
$(w, post, X) \in support(\mu)$, the rectangle $post$ is compact. ^ Then
$M_R = (\mathcal{Y}, V, L, init^{M_R}, inv^{M_R}, flow^{M_R}, prob^{M_R}, \langle pre_v^{M_R} \rangle_{v \in V})$
is the probabilistic multisingular automaton constructed in the following way.

Variables. $\mathcal{Y} = \{y_1, \ldots, y_{2n}\}$, where the $l(i)$-th variable
$y_{l(i)}$ represents the lower bound on the $i$-th variable
$x_i \in \mathcal{X}$ of $R$, the $u(i)$-th variable $y_{u(i)}$ represents the
upper bound on $x_i \in \mathcal{X}$, and $l(i) = 2i - 1$, $u(i) = 2i$.

Initial and invariant sets. For each control mode $v \in V$ and each
$x_i \in \mathcal{X}$, we have
$init^{M_R}(v)_{l(i)} = init^{M_R}(v)_{u(i)} = init^R(v)_i$, and
$inv^{M_R}(v) = \mathbb{R}^{2n}$.

Flow inclusion. For each control mode $v \in V$ and $x_i \in \mathcal{X}$, if
$flow^R(v, \cdot)_i = [l, u]$ then $flow^{M_R}(v, \cdot)_{l(i)} = l$ and
$flow^{M_R}(v, \cdot)_{u(i)} = u$.

^ Note that it follows from [12] that all of these assumptions may be dropped;
however, in such a case, the construction of $M_R$ requires significant
book-work that is independent of probabilistic concerns.

Probability distributions. For each control mode $v \in V$ and
$\mu^R \in prob^R(v)$, there exists a corresponding set
$\{\mu_1^{M_R}, \ldots, \mu_4^{M_R}\} \subseteq prob^{M_R}(v)$, which takes the
following form. For each tuple $(w, post^R, X) \in support(\mu^R)$, and for all
$j \in \{1, \ldots, 4\}$, a corresponding tuple
$(w, post_j^{M_R}, Y_j) \in support(\mu_j^{M_R})$ exists, such that
$\mu_j^{M_R}(w, post_j^{M_R}, Y_j) = \mu^R(w, post^R, X)$. Let
$[l_i, u_i] = pre_v^R(\mu^R)_i$ and $[l'_i, u'_i] = (post^R)_i$ for each
$i \in \{1, \ldots, n\}$. If $x_i \in X$, then
$(post_j^{M_R})_{l(i)} = l'_i$, $(post_j^{M_R})_{u(i)} = u'_i$, and
$y_{l(i)}, y_{u(i)} \in Y_j$ for each $j \in \{1, \ldots, 4\}$. However, if
$x_i \notin X$, then $(post_j^{M_R})_{l(i)}$, $(post_j^{M_R})_{u(i)}$, and $Y_j$
are defined as follows:



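    $(post_1^{M_R})_{l(i)} = \ldots$,  $(post_1^{M_R})_{u(i)} = \ldots$,  $y_{l(i)} \in Y_1$;
    $(post_2^{M_R})_{l(i)} = \ldots$,  $(post_2^{M_R})_{u(i)} = \ldots$,  $y_{l(i)}, y_{u(i)} \in Y_2$;
    $(post_3^{M_R})_{l(i)} = \ldots$,  $(post_3^{M_R})_{u(i)} = \ldots$,  no requirement on $Y_3$;
    $(post_4^{M_R})_{l(i)} = \ldots$,  $(post_4^{M_R})_{u(i)} = \ldots$,  $\ldots$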





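Pre-condition sets. For every $v \in V$ and $\mu^R \in prob^R(v)$, we define the
pre-condition sets for
$\{\mu_1^{M_R}, \ldots, \mu_4^{M_R}\} \subseteq prob^{M_R}(v)$ in the following
way. For every $i \in \{1, \ldots, n\}$, if $pre_v^R(\mu^R)_i = [l, u]$, let:

    $pre_v^{M_R}(\mu_1^{M_R})_{l(i)} = (-\infty, l)$,  $pre_v^{M_R}(\mu_1^{M_R})_{u(i)} = [l, u]$;
    $pre_v^{M_R}(\mu_2^{M_R})_{l(i)} = (-\infty, l)$,  $pre_v^{M_R}(\mu_2^{M_R})_{u(i)} = (u, \infty)$;
    $pre_v^{M_R}(\mu_3^{M_R})_{l(i)} = [l, u]$,  $pre_v^{M_R}(\mu_3^{M_R})_{u(i)} = [l, u]$;
    $pre_v^{M_R}(\mu_4^{M_R})_{l(i)} = [l, u]$,  $pre_v^{M_R}(\mu_4^{M_R})_{u(i)} = (u, \infty)$.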
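The variable indexing, the singular flows and the four-way split of a pre-condition interval can be sketched as follows. This is an illustration for a single variable of $R$, assuming the interval encoding (lo, hi, lo_closed, hi_closed); all function names are hypothetical.

```python
import math


def lower_index(i):
    """l(i) = 2i - 1, with variables numbered from 1 as in the text."""
    return 2 * i - 1


def upper_index(i):
    """u(i) = 2i."""
    return 2 * i


def singular_flows(flow_R):
    """Map rectangular flows {i: (l, u)} of R to singular flows of M_R."""
    flow_M = {}
    for i, (l, u) in flow_R.items():
        flow_M[lower_index(i)] = l
        flow_M[upper_index(i)] = u
    return flow_M


def split_precondition(l, u):
    """Return the four (lower-variable, upper-variable) pre-condition pairs."""
    below = (-math.inf, l, False, False)  # (-oo, l)
    inside = (l, u, True, True)           # [l, u]
    above = (u, math.inf, False, False)   # (u, oo)
    return [(below, inside), (below, above), (inside, inside), (inside, above)]


def contains(interval, x):
    lo, hi, lo_closed, hi_closed = interval
    left = x >= lo if lo_closed else x > lo
    right = x <= hi if hi_closed else x < hi
    return left and right


def enabled_indices(l, u, y_low, y_up):
    """Indices j in 1..4 whose pre-condition holds for (y_low, y_up)."""
    return [
        j + 1
        for j, (pl, pu) in enumerate(split_precondition(l, u))
        if contains(pl, y_low) and contains(pu, y_up)
    ]
```

Because the four pairs of intervals are pairwise disjoint, at most one index is returned for any pair of bounds, and none is returned when the interval of possible values does not meet [l, u].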

The function $\gamma : Q_{M_R} \to 2^{Q_R}$ of [12], where
$\gamma(v, \mathbf{a}) = \{v\} \times \prod_{i=1}^{n} [a_{l(i)}, a_{u(i)}]$, can
also be used in the probabilistic context to relate a set of states of $R$ to a
state of $M_R$. We now propose a strategy for model checking $R$ via the
construction of $M_R$. More precisely, $R$ is simulated by $M_R$ (that is, the
initial state of $R$ is simulated by the initial state of $M_R$), and, by
Theorem 1, if a $\forall$PBTL formula $\Phi$ is satisfied by $M_R$, then this is
sufficient for concluding that $\Phi$ is also satisfied by $R$. Observe that the
simulation is obtained by viewing $\gamma$ as a relation.



Lemma 3. Let $R$ be a probabilistic rectangular automaton, and $M_R$ be the
probabilistic multisingular automaton constructed from $R$. Let
$q \in Q_{M_R}$ be a state of $\mathcal{S}_{M_R}$, and let $r \in \gamma(q)$ be a
state of $\mathcal{S}_R$. Then $r \preceq q$.



We omit the proof of Lemma 3 for reasons of space. An example of the way 
in which Mr forward simulates R is shown in Figure 2 (b). Let the variable set 
of R contain two variables, Xi and X2- Consider a state r G Qr of R, and a 
state q G Qmr of Mr which are such that r G 7(g). Say R nondeterministi- 
cally selects an enabled distribution for choice, which, by construction of 
Mr, can be matched in g by one (and because the pre-conditions of each dis- 
tribution are disjoint, only one) of the four distributions let 

be this distribution, for some j G {!,..., 4 }. Let y^{w,posti,Xi) = 
y^{w,post2,X2) = g, and y^{w,post§, X3) = From the construction of Mr, 
yf^{w,postff,Yi) = g, yf^{w,postfjf,Y2) = g, yf^{w,postf^^,Y:i) = 

Say post2, X2 and postf, X3 are such that, when applied to r, they result in the 
same target sets of states. Then the maximal and minimal values of xi, X2 enco- 
ded in Mr will be the same for post^, X2 and post^, X^, and therefore the proba- 
bility of Mr making a transition to the state g2, which encodes these values, will 
heir^^{q2) = yf^{w,postf2 ,Y2)+yf^{w,postff^ ,Y^) = g-|-i = |. We encode 
the maximal and minimal values reached by Mr from g after the probabilistic 







choice of (w, Yi) to be <7i; therefore, = jJi^^{w,post’^i ,Yi) = 

Say the rectangular sets encoded by q\ and q2 via 7 overlap (see Figure 2(b)). 
Consider the case in which, after probabilistically choosing either of the tu- 
ples {w,post^,Xi) and {w , post^ , X2) , the same target state ri is selected by R, 
which, naturally, is in the intersection of the state sets defined by 7((?i) and 7(92)- 
We let the choice of target state after the probabilistic choice of {w,post§ , X^) 
to be T2, which is in 7(92) but not 7(<7i). Then, from our view of 7 as inducing 
the simulation, we have r\ ^ q\ and ri , r2 ^ (72 • Say is the distribution of 
Sn corresponding to the states ri and r2] then v^{ri) = p,^{w,post^,Xi) + 
fi^{w,post2,X2) = I + g = 5) and v^{r2) = p.^{w,post^ , X^) = To show 
that :< the weight function w, which relates and via is defi- 
ned: let w{ri, qi) = i, w(ri, (72) = w{t2, (72) = Note that the weight function 

is obtained from the probabilities assigned by pL^ to tuples in its support; this 
fact can be used to derive a weight function for any probabilistic transition of R 
and Mr. 

The argument that $\gamma$ induces a simulation with regard to continuous
transitions follows from the non-probabilistic precedent of [20,12]. In Figure
2(b), both $q_1$ and $q_2$ can simulate all of the transitions from any state
lying in the intersection $\gamma(q_1) \cap \gamma(q_2)$, such as $r_1$. For
example, the distribution with the pre-condition $Z_1$ is enabled in $r_1$,
$q_1$ and $q_2$, after some time has elapsed. However, as $r_2$ is in
$\gamma(q_2)$ but not $\gamma(q_1)$, the state $q_2$, but not $q_1$, can simulate
the choice of the distribution with the pre-condition $Z_2$ by $r_2$.

Proposition 1. Let $R$, $M_R$ be defined as in Lemma 3. For any $\forall$PBTL
formula $\Phi$, if $\mathcal{S}_{M_R} \models_{\mathcal{A}} \Phi$ then
$\mathcal{S}_R \models_{\mathcal{A}} \Phi$.



5 Conclusions 

Model checking for hybrid systems is well known to be expensive, and the
strategies presented in this paper are no exception. For multisingular automata,
the size of the region quotient is exponential in the number of variables used
and the magnitude of the upper bounds used in the description of the sets of the
model. Furthermore, the verification algorithm for PBTL [7,6] is polynomial in
the size of this quotient and linear in the size of the formula. Therefore,
further work could address the inefficiencies of this method, for example by
exploiting the model checking methods of [13] for rectangular automata.
Formalisms which admit continuous probabilistic behaviour, such as stochastic
hybrid systems [14], are also of interest, and could be subject to a variant of
the model checking technique for timed automata with continuously distributed
delays of [17].



References 

1. R. Alur, C. Courcoubetis, N. Halbwachs, T. A. Henzinger, P.-H. Ho, X. Nicollin,
A. Olivero, J. Sifakis, and S. Yovine. The algorithmic analysis of hybrid systems.
Theoretical Computer Science, 138:3-34, 1995.

2. R. Alur and D. Dill. A theory of timed automata. Theoretical Computer Science,
126:183-235, 1994.

3. R. Alur, T. A. Henzinger, G. Lafferriere, and G. J. Pappas. Discrete abstractions
of hybrid systems. To appear in Proceedings of the IEEE, 2000.

4. A. Aziz, V. Singhal, F. Balarin, R. Brayton, and A. Sangiovanni-Vincentelli. It
usually works: the temporal logic of stochastic systems. In Proc. 7th CAV, volume
939 of LNCS, pages 155-165. Springer-Verlag, 1995.

5. C. Baier. On algorithmic verification methods for probabilistic systems, 1998.
Habilitation thesis, University of Mannheim.

6. C. Baier and M. Kwiatkowska. Model checking for a probabilistic branching time
logic with fairness. Distributed Computing, 11:125-155, 1998.

7. A. Bianco and L. de Alfaro. Model checking of probabilistic and nondeterministic
systems. In Proc. FST&TCS’95, volume 1026 of LNCS, pages 499-513.
Springer-Verlag, 1995.

8. L. de Alfaro, M. Kwiatkowska, G. Norman, D. Parker, and R. Segala. Symbolic
model checking of concurrent probabilistic processes using MTBDDs and the
Kronecker representation. In Proc. TACAS’00, volume 1785 of LNCS, pages
395-410. Springer-Verlag, 2000.

9. H. Hansson and B. Jonsson. A logic for reasoning about time and reliability. Formal
Aspects of Computing, 6(5):512-535, 1994.

10. T. A. Henzinger, B. Horowitz, and R. Majumdar. Rectangular hybrid games. In
Proc. CONCUR’99, volume 1664 of LNCS, pages 320-335. Springer-Verlag, 1999.

11. T. A. Henzinger, B. Horowitz, R. Majumdar, and H. Wong-Toi. Beyond HyTech:
hybrid systems analysis using interval numerical methods. In Proc. HSCC’00,
volume 1790 of LNCS, pages 130-144. Springer-Verlag, 2000.

12. T. A. Henzinger, P. Kopke, A. Puri, and P. Varaiya. What’s decidable about hybrid
automata? Journal of Computer and System Sciences, 57(1):94-124, 1998.

13. T. A. Henzinger and R. Majumdar. Symbolic model checking for rectangular hybrid
systems. In Proc. TACAS’00, volume 1785 of LNCS, pages 142-156.
Springer-Verlag, 2000.

14. J. Hu, J. Lygeros, and S. Sastry. Towards a theory of stochastic hybrid systems.
In Proc. HSCC’00, volume 1790 of LNCS. Springer-Verlag, 2000.

15. B. Jonsson and K. G. Larsen. Specification and refinement of probabilistic
processes. In Proc. 6th LICS, pages 266-279. IEEE Computer Society Press, 1991.

16. M. Kwiatkowska, G. Norman, R. Segala, and J. Sproston. Automatic verification of
real-time systems with discrete probability distributions. To appear in Theoretical
Computer Science, special issue on ARTS’99: Formal Methods for Real-time and
Probabilistic Systems, 2000.

17. M. Kwiatkowska, G. Norman, R. Segala, and J. Sproston. Verifying quantitative
properties of continuous probabilistic timed automata. In Proc. CONCUR’00,
LNCS. Springer-Verlag, 2000.

18. G. Lafferriere, G. Pappas, and S. Yovine. A new class of decidable hybrid systems.
In Proc. HSCC’99, volume 1569 of LNCS, pages 137-151. Springer-Verlag, 1999.

19. R. Milner. Communication and Concurrency. International Series in Computer
Science. Prentice Hall, 1989.

20. A. Olivero, J. Sifakis, and S. Yovine. Using abstractions for the verification of
linear hybrid systems. In Proc. 6th CAV, volume 818 of LNCS, pages 81-94.
Springer-Verlag, 1994.

21. R. Segala and N. Lynch. Probabilistic simulations for probabilistic processes.
Nordic Journal of Computing, 2(2):250-273, 1995.




Invariant-Based Synthesis of Fault-Tolerant Systems



K. Lano^, David Clark^, K. Androutsopoulos^, and P. Kan^ 



^ Department of Computer Science, 

King’s College London, Strand, London WC2R 2LS 
^ Department of Computing, Imperial College, London SW7 2BZ 



Abstract. Statecharts are a very widely used formalism for reactive
system development; however, there are problems in using them as a
fully formal specification notation because of the conflicting variants of
statechart semantics which exist. In this paper a modular subset of
statechart notation is defined which has a simple semantics, and permits
compositional development and verification. Techniques for decomposing
specifications in this notation, design strategies for incorporating fault
tolerance, and translation to the B formal language, are also described,
and illustrated with extracts from a case study of a fault tolerant system.



1 Introduction 

Finite state machines (FSMs) are highly recommended as a design method for
safety-related systems of SIL levels 2 and above in the IEC 61508 standard [4].
Statecharts are based on state machines but add extra capabilities of
modularization and expressiveness: grouping of states into superstates (OR
composition) and grouping of OR states into concurrent collections (AND
composition). However, in terms of their semantics, statecharts are much less
transparent and more difficult to analyse than FSMs, a situation which is
compounded by the conflicting variants of statechart semantics which exist. By
taking advantage of the characteristic structures of reactive systems, a subset
of classical statecharts may be selected as a modular specification notation
for reactive systems, including real-time and fault-tolerant systems.

Section 2 defines this subset, the Structured Reactive System (SRS) notation.
Section 3 describes the overall development process and the translation from
SRS to B AMN [9]. Section 4 illustrates the process on the case study.

2 Structured Reactive System (SRS) Statechart Notation

SRS is a modular subset of statecharts in which strong scoping on the parts 
of the statechart affected by a given event is imposed. An SRS statechart S is 
an AND composition of a set of modules: OR states which do not have AND 
states as direct or indirect substates. Modules are organised into a (client-server) 
hierarchy: a module transition may only send events to modules lower in the 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 46-57, 2000. 

(c) Springer- Verlag Berlin Heidelberg 2000 







hierarchy (the receivers of the module). Transitions in an SRS module M have
labels of the form $t : e[G]/e_1 \frown \ldots \frown e_n$ where t is an
(optional) transition name, e the name of the event triggering t, G is a
logical guard condition, and the $e_i$ are the events generated by t. G is
optional and defaults to true. The $e_i$ are also optional. The $e_i$ are
events of modules in receivers(M), and only the states of modules in
receivers(M) can be referred to in G.

Appendix A gives the formal definition of SRS and module systems. Typically 
modules in a module system represent sensors, controllers, subcontrollers, and 
actuators. 

Figure 1 shows the typical arrangement of modules for a reactive system 
(LHS), and the associated hierarchy of modules under the receivers relation 
(RHS). Subcontroller 1 and Subcontroller 2 are the receivers of Controller, etc. 
Actuator 3 has transitions for g1 and g2. Each module is a separate OR state
within an overall AND state representing the complete system. 




Fig. 1. Typical subsystem structure



3 Development Method 

In our method, RSDS, reactive control systems are initially represented using
data and control flow (DCFD) notation. Usually the behaviour of sensors,
controllers and actuators will be specified using SRS modules in the notation
of the appendix. Initially only sensors and actuators will have explicit state
machine/module representations, and the specification of the mediating
controller(s) will be implicit in a system invariant P in terms of sensor and
actuator states. The form of P can be used to guide the decomposition of the
controllers and to synthesise control actions, as described in [6]. Invariants
of SRS modules are expressed in the following language. Let S consist of
modules $M_1, \ldots, M_n$ where some of these modules describe the behaviour
of sensors, others of controllers and others of actuators. For each module M a
variable m_state is defined, of type $States_{flatten(M)}$, and formulae based
on equalities m_state = $s_i$ are used to express the current state of
module M.

Except for the most trivial systems, it is necessary to modularise the
specification of the control algorithm, in order to obtain analysable
descriptions. There are several ways in which such a decomposition can be
achieved:






1. Hierarchical composition of controllers: events e are dealt with first by an
overseer controller S which handles certain interactions between components,
and e (or derived events) are then sent to subordinate controllers responsible
for managing the individual behaviour of subcomponents.

This design can also be used to separate responsibility for dealing with
certain aspects of a control problem (such as fault detection) from the
calculation of control responses. For example in the steam boiler system of
[1], detection and responses to inputs that indicate failure of components
are handled in separate controllers to those which handle non-failed signals,
using a “chain of responsibility” design pattern. A similar approach is used
in the case study of Section 4.

2. Horizontal composition of controllers: events are copied to two separate
control algorithms $S_1$ and $S_2$, which compute their reactions
independently.

3. Decomposition by control mode/phase: A separate controller is specified for 
the control reactions to be carried out in each mode or phase of the system. 

The first two are based on the physical decomposition of the actual system, 
whilst the third is based on temporal decomposition. 

Invariants of controllers are decomposed into invariants of subcontrollers 
when the controller is structurally decomposed. A control algorithm is then 
synthesised for each of the controllers, based on their invariants [6]. 

Controller specifications expressed in an SRS module with invariants produce
B machines whose set of operations corresponds to the events which the
controller responds to.

If the SRS description has a tree structured receivers relation, then the SRS
structuring can be mapped directly to a corresponding structure in B: if module
D is in receivers(C), then the machine C' for C INCLUDES the machine D'.
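The mapping from a tree-structured receivers relation to INCLUDES clauses can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary encoding of receivers and the helper names `includes_clauses` and `machine_header` are hypothetical, and only machine headers are produced, not operations. The module names are those of the case study machines.

```python
from typing import Dict, List


def includes_clauses(receivers: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return, for each module, the machines its B machine would INCLUDE.

    Raises ValueError if the relation is not a tree, i.e. if some module
    is a receiver of two different modules.
    """
    owner: Dict[str, str] = {}
    for client, servers in receivers.items():
        for server in servers:
            if server in owner and owner[server] != client:
                raise ValueError(
                    f"{server} is a receiver of both {owner[server]} and {client}"
                )
            owner[server] = client
    return {c: list(s) for c, s in receivers.items() if s}


def machine_header(name: str, includes: List[str]) -> str:
    """Render the first lines of a B machine (header only, no operations)."""
    lines = [f"MACHINE {name}"]
    if includes:
        lines.append("INCLUDES " + ", ".join(includes))
    return "\n".join(lines)


# Hypothetical encoding of the receivers relation of the case study machines.
receivers = {
    "OuterLevel": ["FailureDetectionController"],
    "FailureDetectionController": ["MainController"],
    "MainController": [],
}
clauses = includes_clauses(receivers)
print(machine_header("OuterLevel", clauses.get("OuterLevel", [])))
```

The tree check matters because B permits a machine to be included by at most one other machine, which is why a shared receiver is rejected here rather than silently duplicated.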



4 Case Study: Fault Tolerant Production Cell 

This system [8] is one of a series of industrial case studies based on automated 
manufacturing requirements. It involves the processing of metal pieces which 
enter the system on a feed belt, are conveyed by robot arms to a press, processed 
in the press, and then conveyed by the robot to a deposit belt. The layout of 
this version differs from the simple production cell [7] in two ways: (i) there is no 
crane to make the whole process cyclical; (ii) there are two presses rather than 
one. Otherwise, the movements of the robot arms etc. are identical. 

The specification makes no requirement of alternating the presses. Should 
one fail, then the other takes on all the blanks coming into the system, and once 
the faulty press is working properly, it is brought back online again. 

The system is expected to conform to the three state safety model shown in
Figure 2, with transitions between the normal operating and recoverable failure
states, and transitions from these to the unrecoverable failure state. In the
following section we prove this conformance by constructing abstraction
morphisms from the full state space of the system to the 3 state model.








Fig. 2. Three State Failure Model for Production Cell 



4.1 Statecharts and Abstraction 

For each component of the cell, a series of abstractions and abstraction
morphisms can be constructed. At the lowest level are the tuples of states of
individual sensors and actuators within the component. For a press, for example,
the basic state is a tuple (ok, lows, mids, ups, bs, ms, pts, stm, pval) where
ok is a boolean indicating if the press is failed or not, lows is the lower
position sensor, etc, bs is the blank in press sensor, stm is a boolean
indicating that the arm is clear of the press, pts is the press timer state
(idle, active or expired), pval is the timer value (pul, plm, pmu, reflecting
the deadlines for movement from the upper to the lower position, etc) and ms is
the press motor state. For a single press there are therefore $4 * 3^7$ basic
component states (8748). By constructing a suitable abstraction morphism to a
simpler component model, we can create a clearer and more easily verified
specification, both at the requirements and SRS level and in the B code.
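The state count can be checked by enumerating the tuples. The sketch below assumes value domains that are consistent with the text but not spelled out in it: two booleans (ok, stm), three-valued sensors (yes, no, and the failed reading written 0 in Section 4.3), and three values each for the motor, the timer state and the timer value.

```python
from itertools import product

BOOL = (True, False)
SENSOR = ("yes", "no", "0")      # assumed: "0" is the failed-sensor reading
MOTOR = ("off", "up", "down")    # values of ms used in the text
TIMER = ("idle", "active", "expired")
PVAL = ("pul", "plm", "pmu")

# Component order: (ok, lows, mids, ups, bs, ms, pts, stm, pval)
DOMAINS = (BOOL, SENSOR, SENSOR, SENSOR, SENSOR, MOTOR, TIMER, BOOL, PVAL)

basic_states = list(product(*DOMAINS))
print(len(basic_states))
```

Two boolean components contribute the factor 4 and the seven three-valued components contribute $3^7 = 2187$.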

In this case the high-level state (pstate) of a press is one of:
(i) waiting to receive (w2r): ok = true, (bs = no or stm = false), mids = yes,
lows = no, ups = no, ms = off, pts = idle;
(ii) processing: ok = true, bs = yes, lows = no, ms = up, pts = active,
stm = true, pval = pmu, ups = no;
(iii) completing: ok = true, bs = yes, ms = down, lows = no, stm = true,
pts = active, pval = pul;
(iv) waiting to deliver (w2d): ok = true, (bs = yes or stm = false), lows = yes,
mids = no, ups = no, ms = off, pts = idle;
(v) returning: ok = true, bs = no, mids = no, ups = no, ms = up, stm = true,
pts = active, pval = plm;
(vi) failure: any state where ok = false; ms = off will also be true in this
state.

States where bs = 0 or lows = 0 or mids = 0 or ups = 0, where any
two of lows, mids and ups are yes, or where stm = false with ms ≠ off, or
pts = expired with ms ≠ off are excluded from (i) to (v): any event which leads
to any of these conditions triggers a transition to failure. There are
transitions to failure from any of the other states (which can be grouped into
a superstate normal). Operational invariants of the system can then be simply
defined using the abstract state pstate, for example



$pstate = w2r \;\land\; \odot stm = true \;\land\; bs = yes \;\Rightarrow\; \odot pstate = processing$
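The abstraction from basic press tuples to the high-level state can be written as a function. The following is a sketch under assumptions, not the B code of the case study: the `Press` record, the string encodings of sensor values, and the choice to return `None` for the excluded combinations (which the text treats as triggers for a transition to failure rather than as stable states) are all illustrative.

```python
from typing import NamedTuple, Optional


class Press(NamedTuple):
    ok: bool
    lows: str
    mids: str
    ups: str
    bs: str
    ms: str
    pts: str
    stm: bool
    pval: str


def excluded(p: Press) -> bool:
    """Conditions that rule a tuple out of the abstract states (i) to (v)."""
    failed_reading = "0" in (p.bs, p.lows, p.mids, p.ups)
    two_positions = [p.lows, p.mids, p.ups].count("yes") >= 2
    arm_near_moving_press = (not p.stm) and p.ms != "off"
    timeout_while_moving = p.pts == "expired" and p.ms != "off"
    return (failed_reading or two_positions
            or arm_near_moving_press or timeout_while_moving)


def pstate(p: Press) -> Optional[str]:
    if not p.ok:
        return "failure"
    if excluded(p):
        return None
    if ((p.bs == "no" or not p.stm) and p.mids == "yes" and p.lows == "no"
            and p.ups == "no" and p.ms == "off" and p.pts == "idle"):
        return "w2r"
    if (p.bs == "yes" and p.lows == "no" and p.ms == "up" and p.pts == "active"
            and p.stm and p.pval == "pmu" and p.ups == "no"):
        return "processing"
    if (p.bs == "yes" and p.ms == "down" and p.lows == "no" and p.stm
            and p.pts == "active" and p.pval == "pul"):
        return "completing"
    if ((p.bs == "yes" or not p.stm) and p.lows == "yes" and p.mids == "no"
            and p.ups == "no" and p.ms == "off" and p.pts == "idle"):
        return "w2d"
    if (p.bs == "no" and p.mids == "no" and p.ups == "no" and p.ms == "up"
            and p.stm and p.pts == "active" and p.pval == "plm"):
        return "returning"
    return None
```

Tuples that match none of the listed conditions are left unclassified, since the text defines the abstract states only by the conditions given.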










The module defining the control algorithm derived from these invariants is
shown in Figure 3. States are expressed in the form (lows, mids, ups, bs, ms,
pts, stm, pval), superstates indicated by letters C, R, etc.



Fig. 3. Normal State Controller for Press



A further abstraction to a two state (normal/failure) model can then be made 
simply on the basis of the ok state value. 

Each of the components of the production cell can likewise be abstracted to
a basic safety/operational model consisting of two states, a normal and a
failed state, and a transition from normal to failed and self-transitions on
both states. The flattened product of these eight component models therefore
has 256 states in the form of tuples: (feed belt, elevating rotating table,
robot, robot arm 1, robot arm 2, press 1, press 2, deposit belt).

An abstraction morphism from this state machine to the three state global 
safety model for the cell is defined by: 

1. All components normal $\mapsto$ N

2. One component failed, either press 1 or press 2 $\mapsto$ RF

3. Both presses failed, or some other component failed $\mapsto$ UF

Transitions are mapped correspondingly. This abstraction is an adequate
refinement. A complete reachability analysis concerning failure modes can
therefore







be carried out on this model more effectively than on the fully flattened state
machine model of the system (adequacy of $\sigma : C \to A$ implies that t is
reachable from s in a concrete model C iff $\sigma(t)$ is reachable from
$\sigma(s)$ in the abstraction A).
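The abstraction to the three state model is small enough to enumerate. The sketch below encodes each product state as a tuple of booleans (True meaning failed, an assumed encoding), applies the three mapping rules, and collects the abstract images of the component-level normal-to-failed transitions.

```python
from itertools import product

COMPONENTS = ("feed belt", "elevating rotating table", "robot", "robot arm 1",
              "robot arm 2", "press 1", "press 2", "deposit belt")
PRESSES = {"press 1", "press 2"}


def sigma(failed_flags):
    """Map a product state to N, RF or UF following rules 1 to 3."""
    failed = {name for name, flag in zip(COMPONENTS, failed_flags) if flag}
    if not failed:
        return "N"
    if len(failed) == 1 and failed <= PRESSES:
        return "RF"
    return "UF"


states = list(product((False, True), repeat=len(COMPONENTS)))

counts = {"N": 0, "RF": 0, "UF": 0}
for state in states:
    counts[sigma(state)] += 1

# Images of the transitions in which exactly one more component fails.
abstract_steps = set()
for state in states:
    for i, flag in enumerate(state):
        if not flag:
            nxt = state[:i] + (True,) + state[i + 1:]
            abstract_steps.add((sigma(state), sigma(nxt)))

print(len(states), counts, sorted(abstract_steps))
```

Every failure step lands on a transition that the three state model allows, so reachability questions about failure modes can be asked of three states instead of 256.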

4.2 Controller Decomposition 

Because of the relative independence of the control invariants for the separate
sub-components of the cell, a hierarchical control structure can be used
(Figure 4). Interaction between components is managed by supervisor controllers
which send commands to the controllers of the individual components involved
in a particular interaction. Detection of failures is handled by the FDC
(Failure Detection Controller).




Fig. 4. Decomposition of Controllers 



Failure detection is managed using the hierarchical chain of responsibility
pattern described in [1] for fault-tolerance: at each cycle, the vector of
sensor readings is checked for fault conditions, and failures are notified to
the relevant components (if an unrecoverable failure occurs, a shutdown message
is sent to the main controller, which then propagates it to individual
controllers). Otherwise, the main controller is only sent events which do not
represent failures.






4.3 Fault Detection 

Checking for faults at the initial stage of sensor reading delivery to the
system is essential if properties relating to individual polling cycles must be
detected. For example, it could be regarded as a sign of sensor failure if both
a becomes_stm (stm goes from false to true) event and a blank_arrives (bs goes
from no to yes) event happen for a particular press within the same cycle
(normally the arm delivering a blank would only start moving away once the
blank had been confirmed to have arrived, or a timeout expired). This could not
be detected at inner control levels as these know nothing of polling cycles.

Fault conditions are described by sets of invariants or control rules to be
implemented by the FDC. Examples of these for a press are: (i) the arm moving
into the vicinity of the press while the press is moving:

$stm = false \;\land\; ms \neq off \;\Rightarrow\; ok = false$

Or (ii) the blank falling off during a movement:

$pstate = completing \;\land\; bs = yes \;\land\; \odot bs = no \;\Rightarrow\; \odot ok = false$

For the feed belt we have rules: (i) If feed belt motor is on and S1 (blank
sensor at start of belt) detects presence of blank, but state of S1 does not
change in next cycle, then the motor has failed. (ii) If S2 (blank sensor at
end of belt) changes from blank present to absent, and motor is off, then the
motor has failed or blank has fallen off belt.
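Read as checks over two consecutive polling cycles, the press and feed belt rules above can be sketched as predicates. The function names, the parameter names and the convention that `prev_*` holds the reading from the previous cycle are assumptions made for illustration; they are not the operations of the FDC machine.

```python
def press_fault(pstate: str, prev_bs: str, bs: str, stm: bool, ms: str) -> bool:
    """True if press rule (i) or (ii) flags a fault in the current cycle."""
    # Rule (i): the arm is not clear of the press while the motor is running.
    arm_near_moving_press = (not stm) and ms != "off"
    # Rule (ii): the blank was present and has vanished while completing.
    blank_fell_off = pstate == "completing" and prev_bs == "yes" and bs == "no"
    return arm_near_moving_press or blank_fell_off


def feedbelt_fault(motor_on: bool, prev_s1: str, s1: str,
                   prev_s2: str, s2: str) -> bool:
    """True if feed belt rule (i) or (ii) flags a fault in the current cycle."""
    # Rule (i): motor on, S1 saw a blank, and S1 is unchanged one cycle later.
    blank_stuck = motor_on and prev_s1 == "yes" and s1 == "yes"
    # Rule (ii): S2 goes from present to absent although the motor is off.
    blank_vanished = (not motor_on) and prev_s2 == "yes" and s2 == "no"
    return blank_stuck or blank_vanished
```

Both predicates need the previous reading as well as the current one, which is why such checks sit at the level that sees polling cycles.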

The system has no way of deducing the failure of the feed belt sensors if
these both fail at the same time. Both the belt motor and traffic lights depend
on the signal from S1. If S1 fails at no during an idle period of the belt, then
the belt motor will not be triggered, and the traffic lights will remain green.
In this situation, blanks can begin to pile up on top of each other at S1, which
may eventually cause damage to the system. Only at this time will some sensor
not included in the system specification as it stands detect the failure. For
this reason, we assume that sensors give a 0 reading (distinct from yes and no)
to indicate their failure.

Fault detection rules are implemented in the FDC, which is directly called 
by the outer controller. The outer controller has the form: 

MACHINE OuterLevel
INCLUDES FailureDetectionController
VARIABLES
  fbok, fbtok, ...
INITIALISATION
  fbok := TRUE || fbtok := TRUE || ...
OPERATIONS
  cycle(curr_S1, curr_S2, ..., curr_S20, clockTick) =
    PRE curr_S1 : SENSOR_STATES ∧ ... ∧ curr_S20 : SENSOR_STATES ∧
        clockTick : TIME
    THEN
      /* call FDC: */







      fbok ← check_feedbelt_sensors(curr_S1, curr_S2) ||
      fbtok ← check_feedbelt_rTable(curr_S1, curr_S2, curr_S3, curr_S6) ||
      ...
    END
END

The checking of failure conditions is split into checks for individual compo- 
nents (eg. the first check operation) and checks concerning the interaction of 
several components (the second operation). The FailureDetectionController has 
the form: 

MACHINE FailureDetectionController
INCLUDES MainController

OPERATIONS
  res ← check_feedbelt_sensors(cS1, cS2) =
    PRE cS1, cS2 : SENSOR_STATES
    THEN
      IF /* rule (i) for feedbelt */
         (a1_fbelt_motor_switch = on ∧
          s1_blankFB_start = yes ∧
          cS1 = yes) or
         /* rule (iii) for feedbelt */
         (a1_fbelt_motor_switch = off ∧
          s1_blankFB_start = yes ∧
          cS1 = no) or ...
      THEN
        res := FALSE
      ELSE
        res := TRUE
      END
    END;
  ...
END

5 Conclusion

This paper has defined improvements to statechart notation for use in safety
related reactive systems, and demonstrated the use of this notation and
associated refinement rules on a case study of a fault-tolerant production cell
control system. Tools have been developed to assist in the construction of DCFD
and statechart models of a reactive system, and in the construction of
abstraction mappings between refined and abstract statechart models [2]. The
benefits of the approach include a systematic approach to reactive system
development, using mainly graphical rather than explicitly formal notations,
without sacrificing rigor.

The approach taken is an advance on previous work, since it incorporates
a concept of refinement and safety into software tools for statecharts, which
Statemate and RSML do not. In addition, it provides a structured translation
from statecharts into the B notation, instead of the translation to flat B
modules given in [10].

A related approach to reactive system specification in B is that of [11].
Structuring is introduced as part of refinement in this approach, however, which
may make it less simple to use for non-software engineers. In our approach the
structure of specifications and implementations is identical as both are
derived from the SRS specifications.

References 

1. M. Ali, B Specification of Steam Boiler, MSc thesis, Dept. of Computing, Imperial
College, 1998.

2. K. Androutsopoulos, The Reactive System Design Tool, ROOS Project report,
Department of Computing, Imperial College, 1999.

3. I. Hayes, A Survey of Data Refinement and Full Abstraction in VDM and Z, Dept.
of Computer Science, University of Queensland, 1991.

4. International Electrotechnical Commission, IEC 61508: Functional Safety of
Electrical/Electronic/Programmable Electronic Safety-Related Systems, 1999.

5. K. Lano, J. Bicarregui, and A. Evans, Structured Axiomatic Semantics for UML
Models, ROOM 2000 Proceedings, to appear in Electronic Workshops in Computer
Science, Springer-Verlag, 2000.

6. K. Lano, K. Androutsopoulos, D. Clark, Structuring and Design of Reactive
Systems using RSDS and B, FASE 2000, to appear in LNCS, Springer-Verlag, 2000.

7. C. Lewerentz, T. Lindner (eds.), Formal Development of Reactive Systems, LNCS
Vol. 891, Springer-Verlag, 1995.

8. A. Lötzbeyer, R. Mühlfeld, Task Description of a Flexible Production Cell with
Real Time Properties, FZI, Karlsruhe, 1996.

9. F. Mejia, Formalising Existing Safety-Critical Software, FMERail Workshop No.
2, London, UK, 1998. http://www.ifad.dk/Projects/fmerail.htm.

10. E. Sekerinski, Graphical Design of Reactive Systems, 2nd International Conference
on B, LNCS, Springer-Verlag, pages 182-197, 1998.

11. H. Treharne, S. Schneider, Using a Process Algebra to Control B Operations, IFM
’99, Springer-Verlag, 1999.

A Formal Definition of SRS Statechart Notation 

State machines. Restricted statecharts are defined in terms of state machines.
Formally, a state machine A consists of sets $States_A$ of states, $Events_A$
of events and $Trans_A$ of transitions. Each transition t of A has a source
state $source_A(t) \in States_A$, a target state $target_A(t) \in States_A$,
and an event $event_A(t) \in Events_A$ which denotes the event which may
trigger the transition.

There is a designated initial state $init_A$. It is denoted as the target of a
virtual transition arrow which has no source.

States will be classified as safe or hazard states (accidents are events which
may occur from hazard states). Disjoint sets $Safe_A$, $Haz_A$ denote the
respective sets of safe and hazard states of A.
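One possible encoding of this definition is a small record type. The sketch below is illustrative only: the class name, the field names and the triple representation of transitions are assumptions, and the example instance (a two state component model with hypothetical events `fail` and `tick`, with the failed state treated as a hazard) is not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass
class StateMachine:
    states: FrozenSet[str]
    events: FrozenSet[str]
    # transition name -> (source state, triggering event, target state)
    trans: Dict[str, Tuple[str, str, str]]
    init: str
    hazards: FrozenSet[str] = frozenset()

    def __post_init__(self) -> None:
        if self.init not in self.states:
            raise ValueError("initial state must be a state of the machine")
        if not self.hazards <= self.states:
            raise ValueError("hazard states must be states of the machine")
        for name, (src, ev, tgt) in self.trans.items():
            if src not in self.states or tgt not in self.states:
                raise ValueError(f"transition {name} has an unknown endpoint")
            if ev not in self.events:
                raise ValueError(f"transition {name} has an unknown event")

    @property
    def safe(self) -> FrozenSet[str]:
        """Safe states: the complement of the hazard states."""
        return self.states - self.hazards


component = StateMachine(
    states=frozenset({"normal", "failed"}),
    events=frozenset({"fail", "tick"}),
    trans={
        "t_fail": ("normal", "fail", "failed"),
        "t_stay_normal": ("normal", "tick", "normal"),
        "t_stay_failed": ("failed", "tick", "failed"),
    },
    init="normal",
    hazards=frozenset({"failed"}),
)
```

Deriving the safe states as the complement of the hazards makes the two sets disjoint by construction.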







State machine abstraction morphisms and refinements. Stepwise refinement of
reactive systems involves the successive elaboration of controller responses
and details of the equipment under control. A refinement C of a system A must
be able to respond to all requests that A can, in a compatible way. This is
formalised by the concept of an abstraction morphism.

A state machine abstraction morphism $\sigma : C \to A$ is a function from
$States_C \cup Events_C \cup Trans_C$ to $States_A \cup Events_A \cup Trans_A$
which maps each state s of C to a state $\sigma(s)$ of A, each event $\alpha$
of C to an event $\sigma(\alpha)$ of A, and each transition t of C to a
transition $\sigma(t)$ of A, such that:

1. If the event of t in C is $\alpha$, then the event of $\sigma(t)$ in A is
$\sigma(\alpha)$: $event_A(\sigma(t)) = \sigma(event_C(t))$.

2. $source_A(\sigma(t)) = \sigma(source_C(t))$ and
$target_A(\sigma(t)) = \sigma(target_C(t))$.

3. $\sigma(init_C) = init_A$.

Such a morphism preserves the behaviour of a state machine in the sense that
all transitions $t : s \to s'$ in C for a given event $\alpha$ must correspond
to some abstract transition $\sigma(t) : \sigma(s) \to \sigma(s')$ in A for the
same event. That is, no essentially new behaviour can be introduced for
$\alpha$.
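The three morphism conditions can be checked mechanically on finite machines. The sketch below assumes a plain dictionary encoding (keys `init` and `trans`, transitions as source/event/target triples) and splits the morphism into three lookup tables for states, events and transitions; the small concrete and abstract machines are hypothetical examples.

```python
def is_abstraction_morphism(conc, abst, sigma_s, sigma_e, sigma_t):
    """Check conditions 1 to 3 for a candidate morphism from conc to abst."""
    for name, (src, ev, tgt) in conc["trans"].items():
        a_src, a_ev, a_tgt = abst["trans"][sigma_t[name]]
        if a_ev != sigma_e[ev]:                      # condition 1: events
            return False
        if a_src != sigma_s[src] or a_tgt != sigma_s[tgt]:
            return False                             # condition 2: endpoints
    return sigma_s[conc["init"]] == abst["init"]     # condition 3: initial state


# Hypothetical abstract machine: a two state normal/failed model.
abst = {"init": "normal", "trans": {"f": ("normal", "fail", "failed")}}
# Hypothetical concrete machine: two normal sub-states that can both fail.
conc = {
    "init": "w2r",
    "trans": {
        "t1": ("w2r", "fail", "failure"),
        "t2": ("processing", "fail", "failure"),
    },
}
sigma_s = {"w2r": "normal", "processing": "normal", "failure": "failed"}
sigma_e = {"fail": "fail"}
sigma_t = {"t1": "f", "t2": "f"}

print(is_abstraction_morphism(conc, abst, sigma_s, sigma_e, sigma_t))
```

Mapping a concrete state to the wrong abstract state breaks condition 2 for every transition that touches it, which is how the check detects an inconsistent state map.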

C is a refinement of A via $\sigma$ if $\sigma$ is a state machine abstraction
morphism which is an isomorphism on the set of events of C and A: no new events
are introduced and no abstract events are deleted.

A refinement $\sigma$ is adequate if every state s of A has some corresponding
state of C:

$\forall s : States_A \cdot \exists s' : States_C \cdot \sigma(s') = s$

and if every event $\alpha$ which has an abstract transition from some s also
has a corresponding concrete transition from each corresponding state:

$\forall s : States_A;\ t : Trans_A \cdot$
$\quad event_A(t) = \alpha \land source_A(t) = s \Rightarrow$
$\quad\quad (\forall s' : States_C \cdot \sigma(s') = s \Rightarrow$
$\quad\quad\quad \exists t' : Trans_C \cdot \sigma(event_C(t')) = \alpha \land \sigma(t') = t \land source_C(t') = s')$

This ensures that C can react to $\alpha$ in every situation that A can. By
definition of refinement, the effect of this reaction is also consistent with
that given in A. The adequacy condition is similar to the “preservation of
precondition” requirement of Z refinement [3].
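Both adequacy conditions are finite checks. The sketch below uses an assumed dictionary encoding (keys `states` and `trans`, transitions as source/event/target triples) and hypothetical example machines; it tests surjectivity on states and then, for each abstract transition, looks for a matching concrete transition from every concrete state mapped to its source.

```python
def is_adequate(conc, abst, sigma_s, sigma_e, sigma_t):
    """Check the two adequacy conditions for a refinement from conc to abst."""
    # Every abstract state needs at least one concrete preimage.
    covered = {sigma_s[s] for s in conc["states"]}
    if not set(abst["states"]) <= covered:
        return False
    # Every abstract transition must be matched from each preimage of its source.
    for a_name, (a_src, a_ev, _a_tgt) in abst["trans"].items():
        for s1 in conc["states"]:
            if sigma_s[s1] != a_src:
                continue
            matched = any(
                sigma_t[c_name] == a_name and c_src == s1 and sigma_e[c_ev] == a_ev
                for c_name, (c_src, c_ev, _c_tgt) in conc["trans"].items()
            )
            if not matched:
                return False
    return True


abst = {"states": {"normal", "failed"},
        "trans": {"f": ("normal", "fail", "failed")}}
conc = {"states": {"w2r", "processing", "failure"},
        "trans": {"t1": ("w2r", "fail", "failure"),
                  "t2": ("processing", "fail", "failure")}}
sigma_s = {"w2r": "normal", "processing": "normal", "failure": "failed"}
sigma_e = {"fail": "fail"}
sigma_t = {"t1": "f", "t2": "f"}

print(is_adequate(conc, abst, sigma_s, sigma_e, sigma_t))
```

Dropping the concrete transition from `processing` leaves a concrete state that cannot react to `fail` although its abstraction can, which is exactly the situation adequacy rules out.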

An abstraction morphism $\sigma : C \to A$ is safety preserving if
$s \in Haz_C$ implies $\sigma(s) \in Haz_A$.

Such a $\sigma$ is referred to as a safety abstraction. Likewise a safety
refinement is defined as a refinement which additionally preserves safety.
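Safety preservation reduces to a one-line check over the concrete hazard states. The function name and the example state names below are hypothetical.

```python
def is_safety_preserving(sigma_s, haz_c, haz_a):
    """True if every concrete hazard state maps to an abstract hazard state."""
    return all(sigma_s[s] in haz_a for s in haz_c)


sigma_s = {"w2r": "normal", "processing": "normal", "failure": "failed"}
print(is_safety_preserving(sigma_s, {"failure"}, {"failed"}))
```

The implication runs in one direction only: a safe concrete state may still be mapped to an abstract hazard state without violating the definition.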

OR and AND composition. Statecharts are defined as for state machines, with the
additions of the two constructions of nesting of statecharts within states (OR
composition) and concurrent (AND) composition. In the SRS formalism, these
constructs can be eliminated successively to reduce a statechart to a state
machine (a statechart with only basic states) [5].

An OR state s of a statechart A has an enclosed statechart $smach_A(s)$ and
otherwise has the same properties as a state machine state.
$States_{smach_A(s)}$ are included in $States_A$







and similarly $Trans_{smach_A(s)}$ are included in $Trans_A$ and
$Events_{smach_A(s)}$ in $Events_A$. The graphical notation for an OR state s
is a state enclosing $smach_A(s)$.

An AND state s is a parallel (AND) composition $A \mid B$ of two OR states. The
states of $A \mid B$ are effectively pairs (a, b) of states a of A and b of B.
The graphical notation for $A \mid B$ consists of A and B separated by a dashed
line. $States_{A \mid B} = States_A \cup States_B$ and similarly for
$Trans_{A \mid B}$ and $Events_{A \mid B}$.

Modules and subsystems. In SRS statecharts all systems are described in terms
of modules: an OR state containing only basic and OR states. A system
description S is specified by the AND composition
$M_1 \mid \ldots \mid M_n$ of all the modules contained in it,
$modules(S) = \{M_1, \ldots, M_n\}$. Such an S is termed a module system.

Each module M in S has a set $receivers_S(M)$ of modules in S, which are the
only modules of S which it can test or send events to: transitions t of M may
refer to the state of modules $M' \in receivers_S(M)$ via a guard condition
$condition_M(t)$ built from logical combinations of formulae in x or not(in x)
where x is a state of M', representing that M' is in state x or not,
respectively. t may have a generation sequence
$generations_M(t) : seq(\bigcup_{M' \in receivers_S(M)} Events_{M'})$.

The collection of all generated events of a statechart M is
$Gen_M = \bigcup_{t \in Trans_M} ran(generations_M(t))$.

$Gen_M \cap Events_M = \emptyset$ for a module M in S, and
$Gen_M \subseteq \bigcup_{M' \in receivers_S(M)} Events_{M'}$.
$receivers_S$ is acyclic: $M \notin receivers_S^{+}[\{M\}]$ where
$receivers_S^{+}$ is the transitive closure of $receivers_S$ considered as a
relation. For each module M, the set $receivers_S^{*}[\{M\}]$ is termed the
subsystem S' defined by M. M is then the outer module of S'.
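The acyclicity condition and the subsystem of a module can be computed by a simple reachability search. In this sketch the receivers relation is a dictionary from module name to a list of receivers (an assumed encoding), the subsystem is taken to include M itself since M is its outer module, and the wiring of the example modules is hypothetical.

```python
def closure(receivers, start):
    """Modules reachable from `start` in one or more receivers steps."""
    seen, stack = set(), list(receivers.get(start, ()))
    while stack:
        module = stack.pop()
        if module not in seen:
            seen.add(module)
            stack.extend(receivers.get(module, ()))
    return seen


def is_acyclic(receivers):
    """No module may reach itself through the receivers relation."""
    modules = set(receivers) | {m for rs in receivers.values() for m in rs}
    return all(m not in closure(receivers, m) for m in modules)


def subsystem(receivers, module):
    """The module together with everything below it in the hierarchy."""
    return {module} | closure(receivers, module)


# Hypothetical wiring using the module names of Figure 1.
receivers = {
    "Controller": ["Subcontroller 1", "Subcontroller 2"],
    "Subcontroller 1": ["Actuator 1"],
    "Subcontroller 2": ["Actuator 2", "Actuator 3"],
}
print(is_acyclic(receivers), sorted(subsystem(receivers, "Subcontroller 2")))
```

Because events only flow downwards, an acyclic receivers relation guarantees that the reaction to one external event cannot loop back to the module that started it.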

Additional constraints may be placed on the form of $receivers_S$, for example,
requiring that it is a tree: no two modules have a common receiver. This
corresponds to the purely hierarchical structure of subsystems within a B
development [9].

The set of external events $Ext_S$ of a module system S is the set
$\bigcup_{M \in modules(S)} Events_M - \bigcup_{M \in modules(S)} Gen_M$.
Normally these are the events of the sensor modules and the controller at the
top of the controller hierarchy.

For module systems, the external events of the system must not change in a
refinement. That is $Ext_S = Ext_{S'}$ where S' is the refinement of S.

Definition. If s, s' are states of a statechart C, the notation
$s \sqsubseteq_C s'$ denotes that s = s' or s' is an OR state,
$smach_C(s') = M$ and $s \in States_M$.

Definition. For any state s of a statechart C, $s_{init}$ denotes the initial
state of s: $s_{init} = s$ if s is basic, $s_{init} = (s')_{init}$ if s is an
OR state with initial state s', and
$s_{init} = (s_{1\,init}, \ldots, s_{n\,init})$ if s is an AND state
$s = s_1 \mid \ldots \mid s_n$. $s_{init}$ is therefore always a basic state or
tuple of basic states.

Definition. If M contains only basic states, then
$States_{flatten(M)} = States_M$. Otherwise
$States_{flatten(M)} = \bigcup_{s \in States_M} States^{s,M}$, where, if s is a
state of a statechart M, then, if s is basic: $States^{s,M} = \{s\}$. If s is
an OR state, $E = smach_M(s)$:
$States^{s,M} = \bigcup_{s' \in States_E} States^{s',E}$. If s is an AND state,
$s = s_1 \mid s_2$: $States^{s,M} = States^{s_1,M} \times States^{s_2,M}$.
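The recursive definition of flattened states can be followed directly in code. The sketch assumes a simple nested encoding that is not part of the paper: a basic state is a string, an OR state is the pair `("or", [substates])` and an AND state is the pair `("and", [components])`.

```python
from itertools import product


def flat(state):
    """Flattened states contributed by one state, following the definition."""
    if isinstance(state, str):          # basic state: just itself
        return {state}
    kind, parts = state
    if kind == "or":                    # OR state: union over enclosed states
        result = set()
        for sub in parts:
            result |= flat(sub)
        return result
    if kind == "and":                   # AND state: tuples of component states
        return set(product(*(flat(part) for part in parts)))
    raise ValueError(f"unknown state kind: {kind}")


def flatten_states(top_level_states):
    """States of flatten(M) for a module M given by its top-level states."""
    result = set()
    for state in top_level_states:
        result |= flat(state)
    return result


module = ["idle", ("or", ["up", "down"])]
print(sorted(flatten_states(module)))
```

OR nesting only adds states while AND composition multiplies them, which is why flattening the product of many two state components grows as a power of two.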

Definition. For a tree-structured module system S, flatten(S), the flattened
version of S, can be constructed by systematic removal of OR and AND states,
with appropriate re-targeting of transitions [2,5]. It is a state machine.







Definition. A (SRS) statechart abstraction morphism $\sigma$ from a concrete
statechart C to an abstract statechart A maps states of C to those of A, events
of C to events of A and transitions of C to those of A such that:

1. Source, target and trigger of transitions are preserved:

$source_A(\sigma(t)) = \sigma(source_C(t))$
$target_A(\sigma(t)) = \sigma(target_C(t))$
$event_A(\sigma(t)) = \sigma(event_C(t))$

2. If s is an OR state of C then $\sigma(s)$ is an OR state or basic state of
A, and initial states are preserved:

$\sigma(s_{init}) = (\sigma(s))_{init}$

3. $\sigma$ preserves $\sqsubseteq$:
$s \sqsubseteq_C s' \Rightarrow \sigma(s) \sqsubseteq_A \sigma(s')$. This means
that states within particular modules M of C remain in the abstraction
$\sigma(M)$ of M in A.

4. The module structure of C is preserved, ie:
$receivers_A(\sigma(M)) = \sigma[receivers_C(M)]$.

5. Conditions and generations are preserved:

$condition_A(\sigma(t)) = \sigma(condition_C(t))$
$generations_A(\sigma(t)) \ll \sigma \circ generations_C(t)$

where $s_1 \ll s_2$ denotes that $s_1$ is a (possibly non-contiguous)
subsequence of $s_2$.

Further conditions are needed in order that $flatten(\sigma)$ can be defined:

1. $\sigma$ is the union of separate morphisms $\sigma_i : MC_i \to MA_i$
   between corresponding modules of $C$ and $A$, $\sigma(MC_i) = MA_i$, with
   these morphisms being compatible on events:

   $\alpha \in Events_{MC_i} \wedge \beta \in Events_{MC_j} \Rightarrow (\sigma_i(\alpha) = \sigma_j(\beta) \Rightarrow \alpha = \beta)$

   $\alpha \in Events_{MC_i} \cap Events_{MC_j} \Rightarrow \sigma_i(\alpha) = \sigma_j(\alpha)$

   where $i \neq j$.

2. Each of the $\sigma_i$ is adequate, and every module of $A$ must correspond
   to some module of $C$.

3. $x \models E$ in $C$ ($x$ satisfies $E$, where $x$ is a state of $C$ and $E$
   a transition condition) implies that $\sigma(x) \models \sigma(E)$, for each
   such condition.

   This is ensured in particular for positive $E$, i.e., for $E$ containing no
   occurrences of $\neg$ or $\Rightarrow$.

Given these constraints, if $S$ and $S'$ are tree-structured module systems,
connected by $\sigma : S' \to S$, then there is an abstraction morphism
$\tau = flatten(\sigma) : flatten(S') \to flatten(S)$ derived from $\sigma$.
This is proved by structural induction on $S'$.

Similarly there are results showing that $\tau$ is also adequate under these
constraints, provided that $\sigma(s_1) \sqsubseteq_A \sigma(s_2)$ implies
$s_1 \sqsubseteq_C s_2$ for states of $C$, the receivers structures of $C$ and
$A$ are identical except for new, complete, leaf modules in $C$, and there are
no generations on initial transitions of $C$ or $A$. In addition it can be
shown that if $\sigma$ is safety-preserving, so is $\tau$.




Modeling Faults of Distributed, Reactive Systems*




Max Breitling 

Institut für Informatik
Technische Universität München
D-80290 München, Germany
http://www.in.tum.de/~breitlin




Abstract. Formal methods can improve the development of systems
with high quality requirements, since they usually offer a precise,
non-ambiguous specification language and allow rigorous verification of
system properties. Usually, these mainly abstract specifications are
idealistic and do not reflect faults, so that faulty behavior - if treated at
all - must be specified as part of the normal behavior, increasing the
complexity of the system. It is more desirable to distinguish normal and faulty
behavior, making it possible to reason about faults and their effects.

In this paper the notions of faults, errors, failures, error detection, error
messages, error correcting components and fault tolerance are discussed,
based on a formal model that represents systems as composition of
interacting components that communicate asynchronously. The behavior of
the components is described by black-box properties and state transition
systems, with faults being modeled by modifications of the properties or
transitions.



1 Introduction 

One of the goals of software engineering is the development of correct software.
Correctness needs to be defined, usually by a specification that describes the
system to be constructed in a precise and unambiguous way. The most rigorous
approach to establishing the correctness of the system under consideration
is the use of formal methods, which allow us to prove that the system indeed
meets its specification.

Nevertheless, systems developed using formal methods can still fail:
subcomponents can be unreliable, some (possibly undocumented) assumptions turn
out to be invalid, or the underlying hardware simply fails. It can be argued
that this was caused by mistakes introduced during the formal development, e.g.
by making overly idealistic assumptions about the environment. In this paper, we
explore another approach: We embed the notion of a fault in the context of
formal methods, targeting two major goals:

— Support for the development of fault-tolerant systems, requiring a precise 
definition of faults and errors. 

* This work is supported by the DFG within the Sonderforschungsbereich 342/A6. 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 58-69, 2000. 

© Springer- Verlag Berlin Heidelberg 2000 






— Reduction of the complexity of formal development by allowing a
methodological separation of normal and faulty behavior. After the fault-free
version of the system is developed, the possible faults and appropriate
countermeasures can be integrated seamlessly in the system.

Modeling faults already at the level of specifications could sound
contradictory, because the specification is intended to describe the desired
behavior, and nobody wants faults! Moreover, in an early development phase it
is normally unknown which faults can occur in a system, simply because it is
still unknown what components will be used and how they can fail. Nevertheless,
certain kinds of faults can be anticipated already during system development in
general, e.g. from experience or for physical reasons: a transmission of a
message can, for instance, always fail. If these faults can be treated already
at an abstract level by a general fault handling mechanism, it is sensible to
describe the faults already within the specification, and not postpone this to
a later phase in the development process.

In this paper, we enrich the model of FOCUS with the notions of faults, errors,
failures and fault-tolerance and discuss their connections and use. Since FOCUS
offers methodological support for specifying and verifying reactive systems,
including a formal foundation, description techniques, a compositional
refinement calculus and tool support, we expect benefits when FOCUS is combined
with results from the area of fault-tolerance. While most other approaches are
concerned mainly with foundations of fault tolerance, we try to keep an eye on
the applicability for users that are not experts in formal methods. Therefore,
our long-term targets - not yet reached - are syntactic criteria for certain
properties instead of logical characterizations, diagrams instead of formulas,
and easy-to-use recipes for modifying systems into their fault-tolerant
versions.

In the next section, we describe very briefly our system model of distributed,
interacting, reactive components. In Section 3 we introduce faults as
modifications of systems. Section 4 contains a discussion of how the formal
definitions can be used to describe fault assumptions, and to detect, report
and correct faults. In the last section we conclude and discuss future work.¹

2 System Model 

Our system model is a variant of the system model of FOCUS [5,6]. A system 
is modeled by defining its interface and its behavior. The system’s interface is 
described by the (names of the) communication channels with the types of the 
messages that are sent on them. The (asynchronous) communication along all 
channels is modeled by (finite or infinite) message streams. The behavior of a 
system is characterized by a relation that contains all possible pairs of input and 
output streams. This relation can be described in (at least) two ways on different 
abstraction levels. 

¹ Due to lack of space, all examples are omitted but can be found in an extended
version of this paper on the author's homepage.







A Black Box Specification defines the behavior relation by a formula $\Phi$ with
variables ranging over the input and output streams. The streams fulfilling
these predicates describe the allowed black-box-behavior of a system. We can use
several operators to formulate the predicates, as the prefix relation
$\sqsubseteq$, the concatenation of streams and the expression $s.k$ for the
$k$-th element of a stream $s$, to mention just a few [5].

A more operational State-Based View is offered by State Transition Systems
(STS) that describe the behavior in a step-by-step manner: Depending on the
current state, the system reads some messages from the input channels, and
reacts by sending some output and establishing a successive state. A STS is
defined by its internal variables with their types, an initial condition, a set
$T$ of transitions and a set $T^{\varepsilon}$ of environment transitions,
precisely formalized in [3]. The possible behaviors of a system are described
by the set $\langle\langle S \rangle\rangle$ containing all executions $\xi$ of
the system. Executions are defined in the usual way as sequences of states
$\alpha$. A STS can be defined in a graphical or tabular notation.

Both views on systems can be formally connected: An infinite execution of a 
STS defines least upper bounds for the message streams that are assigned to the 
input/output channels, and therefore establishes a black-box relation. In [3,4] 
the language, semantics and proof techniques are investigated in detail. 

FOCUS offers notions for composition and refinement supporting a top-down
development of systems. The behavior of a composed system $S_1 \otimes S_2$ can
be derived from the behavior of its components. The interface refinement of
$S_1$ by $S_2$ states that the executions of $S_2$ are also executions of $S_1$
with modifications at the interface described by the relations $R_I$, $R_O$.
Compositionality ensures that refining a system's component means refining the
overall system.

3 Modifications and Faults 

Intuitively, faults in a system are connected with some discrepancy between an
intended system and an actual system. To be able to talk about faults, their
effects and possible countermeasures, we need a clear definition of the term
fault. We suggest identifying faults with the modifications needed to transform
the correct system into its faulty version.

In this section, we define modifications of systems, both for the black-box 
and the operational view, and base the notions of fault, error and failure on 
these modifications. 

3.1 Modifying a System 

In the process of adapting a specified system to a more realistic setting containing 
faults, we have to be able to change both the interface and the behavior. 

Interface modifications We allow the extension of a type of a channel and the 
introduction of new channels. The behavior stays unchanged if the specification 
is adjusted so that it ignores new messages on new input channels, while it may 
behave arbitrarily on new output channels. For development steps towards a 







fault-tolerant system it is normally expected that the behavior does not change
in the case faults do not occur. Therefore we are interested in criteria for
behavior maintenance that are easy to check. For interface modifications,
these criteria can be defined syntactically according to the description
technique used, e.g. black-box formulas, tables or state machines. We allow
neither the removal of channels nor a type restriction for a channel, because
this could easily lead to changes of the behavior. A change of the types for the
channels follows the idea of interface refinement. Under certain conditions,
these changes maintain (the properties of) the behavior. In this paper, we will
not investigate this topic.

Behavior modifications A fault-affected system normally shows a different
behavior than the idealistic system. Instead of describing the fault-affected
system, we focus on the difference of both versions of the system and suggest a
way to describe this difference for black-box views and state machines.

Having $\Phi$ as the black-box specification of the fault-free system, we need
to be able to strengthen this predicate to express further restrictions, but
also to weaken it to allow additional I/O-behaviors. We use a pair of formulas
$\mathcal{M} = (\Phi_E, \Phi_F)$ and denote a modified system by

  $\Phi \triangle \mathcal{M}$ (read: $\Phi$ modified by $\mathcal{M}$)

whose black-box specification is defined by

  $(\Phi \wedge \Phi_E) \vee \Phi_F$

The neutral modification is denoted by (true, false), and the modification
towards an arbitrary $\Psi$ is expressed by (false, $\Psi$).
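
The definition can be illustrated by a minimal sketch, assuming (as a
simplification not made in the paper) that streams are finite tuples and that a
black-box specification is a Boolean function of one input and one output
stream. The names `modify`, `identity_spec` and `may_lose` are hypothetical and
chosen only for this illustration.

```python
from typing import Callable, Tuple

Stream = Tuple[int, ...]
Spec = Callable[[Stream, Stream], bool]  # predicate over (input, output)


def modify(phi: Spec, phi_e: Spec, phi_f: Spec) -> Spec:
    """Return the modified specification (phi and phi_e) or phi_f."""
    return lambda i, o: (phi(i, o) and phi_e(i, o)) or phi_f(i, o)


def is_subsequence(o: Stream, i: Stream) -> bool:
    """True if o can be obtained from i by deleting messages."""
    it = iter(i)
    return all(any(x == y for y in it) for x in o)


# Fault-free specification (illustrative): the output equals the input.
identity_spec: Spec = lambda i, o: o == i
# Additional behavior (illustrative): messages may be lost.
may_lose: Spec = lambda i, o: is_subsequence(o, i)

always: Spec = lambda i, o: True
never: Spec = lambda i, o: False

neutral = modify(identity_spec, always, never)   # the pair (true, false)
lossy = modify(identity_spec, always, may_lose)  # weakened specification
```

Here `neutral` accepts exactly the behaviors of `identity_spec`, while `lossy`
additionally accepts every output that is a subsequence of the input.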

For a state-based system description, we express modifications of the behavior
by modifications of the transition set (as e.g. in [1,9,12]). Obviously, we can
add or remove transitions and define a behavior modification by a pair $(E, F)$
of two sets of transitions. The set $E$ contains transitions that are no longer
allowed in an execution of the modified system. The set $F$ contains additional
transitions. The transitions in $F$ can increase the nondeterminism in the
system, since in states with both old and new transitions being enabled, the
system has more choices how to behave. We can use $F$ to model erroneous
transitions the system can spontaneously take. The executions of a modified
system are defined by

  $\langle\langle S \triangle \mathcal{M} \rangle\rangle = \{\xi \mid (\xi.k, \xi.(k+1)) \in (T \setminus E) \cup F \cup T^{\varepsilon}\}$

i.e. a non-environment transition has to be in $F$ or in $T$ but not in $E$. In
this formalism, $(\emptyset, \emptyset)$ is the neutral modification, and
choosing $E$ to contain all transitions and $F$ as an arbitrary set of
transitions shows that this formalism is again expressive enough.
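
As a rough sketch of this definition, assume that states are plain strings,
that a transition is a pair of states, and that only finite prefixes of
executions are checked (the paper's executions are sequences that may be
infinite). All names below, including `T_env` for the environment transitions,
are assumptions made for the illustration.

```python
from typing import FrozenSet, Sequence, Tuple

State = str
Transition = Tuple[State, State]
Transitions = FrozenSet[Transition]


def modified_transitions(T: Transitions, E: Transitions, F: Transitions,
                         T_env: Transitions) -> Transitions:
    """Steps allowed after the modification: T minus E, plus F and T_env."""
    return (T - E) | F | T_env


def is_execution_prefix(xi: Sequence[State], initial: FrozenSet[State],
                        T: Transitions, E: Transitions, F: Transitions,
                        T_env: Transitions) -> bool:
    """Check that xi starts initially and every step of xi is allowed."""
    allowed = modified_transitions(T, E, F, T_env)
    return (len(xi) > 0 and xi[0] in initial
            and all(step in allowed for step in zip(xi, xi[1:])))


# Illustrative system: a worker toggling between two states.
T = frozenset({("idle", "busy"), ("busy", "idle")})
T_env = frozenset({("idle", "idle"), ("busy", "busy")})  # environment steps
F = frozenset({("busy", "crashed")})                     # spontaneous crash
E = frozenset({("busy", "idle")})                        # no longer allowed
NONE: Transitions = frozenset()
INIT = frozenset({"idle"})
```

With the neutral modification `(NONE, NONE)` the sequence idle, busy, idle is
accepted; adding `F` also admits idle, busy, crashed, and removing `E` rules
out the return from busy to idle.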

It is an interesting but open question if and how both notions for
modifications can be connected. If $\Phi$ is a property of a STS $S$, and both
are modified in a similar way, then $\Phi \triangle (\Phi_E, \Phi_F)$ should be
the modified property of the modified system $S \triangle (E, F)$. Similar
approaches and partial results are discussed in [2,7,13].







3.2 Combining Modifications 

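To explore the effect of multiple modifications, we define the composition of
modifications. For black-box specifications, the operator $+$ combines two
modifications $\mathcal{M}_i = (\Phi_{E_i}, \Phi_{F_i})$ of a system
($i = 1, 2$), assuming $\Phi_{F_1} \Rightarrow \Phi_{E_2}$ and
$\Phi_{F_2} \Rightarrow \Phi_{E_1}$, to one modification by

  $\mathcal{M}_1 + \mathcal{M}_2 = (\Phi_{E_1} \wedge \Phi_{E_2}, \Phi_{F_1} \vee \Phi_{F_2})$

We reuse the operator $+$ for transition systems, and define for $(E_i, F_i)$,
assuming $E_1 \cap F_2 = \emptyset$ and $E_2 \cap F_1 = \emptyset$, the
combination

  $(E_1, F_1) + (E_2, F_2) = (E_1 \cup E_2, F_1 \cup F_2)$

The assumptions avoid confusion about executions resp. transitions that are
added by one modification but removed by the other, and assert the following
equalities, with $\mathcal{S}$ representing $\Phi$ resp. $S$:

  $\mathcal{S} \triangle (\mathcal{M}_1 + \mathcal{M}_2) = (\mathcal{S} \triangle \mathcal{M}_1) \triangle \mathcal{M}_2 = (\mathcal{S} \triangle \mathcal{M}_2) \triangle \mathcal{M}_1$

We can use this operator to express combinations of faults for defining the
notion of fault-tolerant systems.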
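
The equalities for transition systems can be checked on a small example. The
following is only a sketch under simplifying assumptions: transitions are pairs
of state names, environment transitions are ignored, and the helper names
`apply` and `combine` are hypothetical.

```python
from typing import FrozenSet, Tuple

Transition = Tuple[str, str]
Transitions = FrozenSet[Transition]
Modification = Tuple[Transitions, Transitions]  # the pair (E, F)


def apply(T: Transitions, m: Modification) -> Transitions:
    """Transition set of the modified system: remove E, then add F."""
    E, F = m
    return (T - E) | F


def combine(m1: Modification, m2: Modification) -> Modification:
    """Combine two modifications; reject pairs violating the assumptions."""
    (E1, F1), (E2, F2) = m1, m2
    if E1 & F2 or E2 & F1:
        raise ValueError("a transition is added by one modification "
                         "and removed by the other")
    return (E1 | E2, F1 | F2)


T = frozenset({("a", "b"), ("b", "c"), ("c", "a")})
m1 = (frozenset({("a", "b")}), frozenset({("a", "c")}))
m2 = (frozenset({("b", "c")}), frozenset({("b", "a")}))

# Under the side conditions, the order of application is irrelevant.
both = apply(T, combine(m1, m2))
assert both == apply(apply(T, m1), m2) == apply(apply(T, m2), m1)

# Without the side conditions, the order of application matters.
t = ("a", "b")
remove_t = (frozenset({t}), frozenset())
add_t = (frozenset(), frozenset({t}))
assert apply(apply(T, remove_t), add_t) != apply(apply(T, add_t), remove_t)
```

The last two lines show why the disjointness assumptions are needed: removing
and then re-adding a transition differs from adding and then removing it.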

For a composite system $S = S^1 \otimes S^2$ we can derive the modification of
this system from the modifications of its constituents, and can calculate the
impact of a fault of a component upon the overall system. For black-box
specifications, we define the derived modification of the system by

  $\Phi_E = \Phi_E^1 \wedge \Phi_E^2$
  $\Phi_F = (\Phi_F^1 \wedge \Phi^2 \wedge \Phi_E^2) \vee (\Phi^1 \wedge \Phi_E^1 \wedge \Phi_F^2) \vee (\Phi_F^1 \wedge \Phi_F^2)$

For modifications of the transition sets of the components, we can define
$\mathcal{M} = (E, F)$ with ($\wedge$ denotes the pairwise conjunction of
elements of both sets)

  $E = E^1 \wedge T^2 \cup T^1 \wedge E^2$ and
  $F = F^1 \wedge (T^2 \setminus E^2) \cup (T^1 \setminus E^1) \wedge F^2 \cup F^1 \wedge F^2$

With the same assumptions for the component's modifications as above, this
results for both formalisms in

  $S \triangle \mathcal{M} = (S^1 \triangle \mathcal{M}^1) \otimes (S^2 \triangle \mathcal{M}^2)$
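
For the black-box case, the result can be made plausible by distributing the
conjunction. The derivation below is a sketch that assumes the specification of
the composite is the plain conjunction $\Phi = \Phi^1 \wedge \Phi^2$ of the
component specifications and ignores the hiding of internal channels; the
superscripts naming the components are notation chosen for this illustration.

```latex
\begin{align*}
(\Phi^1 \triangle \mathcal{M}^1) \wedge (\Phi^2 \triangle \mathcal{M}^2)
  &= \bigl((\Phi^1 \wedge \Phi_E^1) \vee \Phi_F^1\bigr) \wedge
     \bigl((\Phi^2 \wedge \Phi_E^2) \vee \Phi_F^2\bigr) \\
  &= (\Phi^1 \wedge \Phi^2 \wedge \Phi_E^1 \wedge \Phi_E^2)
     \vee (\Phi^1 \wedge \Phi_E^1 \wedge \Phi_F^2)
     \vee (\Phi_F^1 \wedge \Phi^2 \wedge \Phi_E^2)
     \vee (\Phi_F^1 \wedge \Phi_F^2) \\
  &= (\Phi \wedge \Phi_E) \vee \Phi_F
\end{align*}
```

The last step takes $\Phi_E = \Phi_E^1 \wedge \Phi_E^2$ and lets $\Phi_F$ be
the disjunction of the remaining three terms.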



3.3 Faults, Errors and Failures 

In the literature the meaning of the terms fault, error and failure is often
described just informally (e.g. [10,11]). In our setting, we can define these
notions more precisely.

The faults of a system are the causes for the discrepancy between an intended
and actual system. Therefore, it makes sense to call the transitions of a
modification $\mathcal{M}$ the faults of a system. What is called a fault of a
system cannot be decided by looking at an existing system alone; this normally
depends on







the intended purpose of the system, on an accepted specification and even on
the judgment of the user or developer. What one person judges as fault, the
other calls a feature. The definition of modifications given in the previous
sections is intended to offer a possibility to document that decision, and
explicitly represent the faults in a modified system. Of course, the modified
system could be described by one monolithic specification without reflecting
the modifications explicitly, but it is exactly this distinction between "good"
and "bad" transitions that allows our formal definitions.

A fault can lead to an erroneous state, if an existing faulty transition is
taken during an execution of the system. We define a state $\alpha$ to be an
error (state) if this state can only be reached by at least one faulty
transition. The set of errors of a system $S$ under the modifications
$\mathcal{M} = (E, F)$ is defined as

  $ERROR(S, \mathcal{M}) = \{\alpha \mid \forall k \in \mathbb{N}, \xi \in \langle\langle S \triangle \mathcal{M} \rangle\rangle \bullet \xi.k = \alpha \Rightarrow \exists l < k \bullet (\xi.l, \xi.(l+1)) \in F\}$

Note that all unreachable states are error states, and the set $E$ enlarges the
set of unreachable states. The set of correct states can be defined as the set
of valuations that can be reached by normal transitions (in $T$) only. As long
as we do not require $F \cap T = \emptyset$, it is possible that states are
both correct states and error states. We cannot sensibly define errors for the
black-box view, since neither states nor internals do exist in that context.
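
For a finite state space, the set of error states can be computed by a
reachability search. The sketch below assumes that states are strings, that
transitions are pairs of states, and that every finite fault-free path can be
extended to a full execution (for instance by environment steps); under these
assumptions a state is an error exactly if it cannot be reached without taking
a transition from $F$. The function names are hypothetical.

```python
from collections import deque
from typing import FrozenSet, Iterable, Set, Tuple

Transition = Tuple[str, str]
Transitions = FrozenSet[Transition]


def reachable(initial: Iterable[str], transitions: Transitions) -> Set[str]:
    """States reachable from the initial states (breadth-first search)."""
    seen = set(initial)
    queue = deque(seen)
    while queue:
        state = queue.popleft()
        for source, target in transitions:
            if source == state and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def error_states(states: Iterable[str], initial: Iterable[str],
                 T: Transitions, E: Transitions, F: Transitions,
                 T_env: Transitions = frozenset()) -> Set[str]:
    """States that cannot be reached without a transition from F."""
    fault_free_steps = ((T - E) | T_env) - F
    return set(states) - reachable(initial, fault_free_steps)


STATES = {"idle", "busy", "crashed", "unused"}
T = frozenset({("idle", "busy"), ("busy", "idle")})
F = frozenset({("busy", "crashed")})
```

In this example `error_states(STATES, {"idle"}, T, frozenset(), F)` contains
both the state `crashed`, which needs the faulty transition, and the
unreachable state `unused`, in line with the remark that all unreachable states
are error states.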

A failure is often defined as a visible deviation of the system relative to some
specification. Since we can distinguish the inside and outside of systems, we
can also reflect different visibilities of errors. Our definition of a failure
depends on the kind of specification: If we regard a black-box specification as
the specification of a system, a failure occurs in a state $\alpha$ if the
property gets violated in that state. But we can also define a failure if the
unmodified STS $S$ is understood as specification, and
$S \triangle \mathcal{M}$ as faulty system. An error state $\alpha$ is
additionally called a failure if all states with the same visible input/output
behavior are error states:

  $FAILURE(S, \mathcal{M}) = \{\alpha \mid \forall \beta \bullet \beta \stackrel{I \cup O}{=} \alpha \Rightarrow \beta \in ERROR(S, \mathcal{M})\}$

Two valuations $\alpha$ and $\beta$ coincide on a set of variables $V$, if they
assign the same value to all variables in $V$, i.e.
$\alpha \stackrel{V}{=} \beta \Leftrightarrow \forall v \in V \bullet \alpha.v = \beta.v$.
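
For a finite state space, this definition can be sketched as a filter over the
error states. The representation of a state as a pair of a visible output and
an internal counter, and the projection `output_of`, are assumptions made only
for this illustration.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

State = Tuple[str, int]  # (visible output, internal counter), illustrative


def failure_states(states: Iterable[State], errors: Set[State],
                   visible: Callable[[State], Hashable]) -> Set[State]:
    """Error states whose visible part is shared only with error states."""
    all_states = list(states)
    return {a for a in errors
            if all(b in errors
                   for b in all_states if visible(b) == visible(a))}


STATES = {("ok", 0), ("ok", 1), ("garbled", 0)}
ERRORS = {("ok", 1), ("garbled", 0)}
output_of = lambda state: state[0]  # project a state to its visible part
```

Here `("ok", 1)` is an error but not a failure, because the correct state
`("ok", 0)` looks the same from outside; `("garbled", 0)` is a failure.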



3.4 Internal vs. External Faults 

Up to this point, we focused on internal faults: The behavior deviation resp. 
the faulty transitions occurred inside the system. But a system can also suffer 
from faults taking place outside a system, i.e. in its environment. A discussion 
of failures of the environment requires explicit or implicit assumptions about its 
behavior. An explicit assumption can be formulated in the context of black-box 
views by a formula that describes the assumed properties of the input streams. 
If this assumption is not fulfilled, the system’s behavior is usually understood 
to be not specified so that an arbitrary, chaotic behavior may occur. We think 







this situation relates to an external fault, and should be treated by a
reasonable reaction of the system instead of undefined behavior. We need further
methodological support offering notions of refinement for these cases: Given an
assumption/guarantee specification $A/G$, we need to be able to weaken $A$ and
adapt $G$ so that the original behavior stays untouched if no external faults
occur, but a sensible reaction is defined if they do.

The type correctness of the input messages can be regarded as another explicit
assumption about the environment. If the interface is changed so that new
messages can be received, we have to refine the behavior of the system in an
appropriate way.

If the system is specified by a STS, but no explicit environment assumptions
are defined, we can nevertheless try to find implicit assumptions. If the system
is in a certain state, it is normally expected that at least one of the
transitions should be eventually enabled. In some cases, it can indeed be meant
that a system gets stuck in certain situations, but normally a weak form of
liveness is wanted: The inputs should finally be consumed, and a state where a
system gets stuck is a kind of error state with invalidated liveness. We regard
these questions and the distinction of internal and external faults as an
interesting area for future research.

4 Dealing with Faults 

Introducing a formal framework for formalizing faults needs to be accompanied
by some methodological advice on how the formalism can be used. In this section,
we discuss how fault occurrences and dependencies between fault models can be
expressed by virtual components, mention requirements for error detection and
the introduction of error messages, and define fault-tolerance.



4.1 Refined Fault Models 

To describe a system with certain faults, we can modify a system accordingly by
adding fault transitions. In specific cases, these modifications could change
the behavior too much, since these transitions can be taken whenever they are
enabled. Sometimes, we want to express certain fault assumptions that restrict
the occurrence of faults. For example, we would like to express that two
components of a system can fail, but never both of them at the same time, or we
want to express probabilities about the occurrence of faults, e.g. state that a
transition can fail only once in n times, for some n.

To be able to formalize these fault assumptions, we suggest introducing
additional input channels that are used similarly to prophecies. The enabledness
of the fault transitions can be made dependent on the values received on these
prophecy channels. We can then add an additional component that produces the
prophecies that represent the fault assumption. During the verification, these
virtual components and prophecy channels can be used as if they were normal
components, even though they will never be implemented.
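
The idea can be sketched for the fault assumption that two components never
fail at the same time. In this illustration, which is not taken from the paper,
a virtual component produces a stream of prophecy pairs, and a lossy
transmission loses a message only if its prophecy says so. As a simplification,
an enabled fault transition is always taken here, whereas in the formalism it
merely may be taken. All names are hypothetical.

```python
import random
from typing import Iterable, Iterator, List, Optional, Tuple

Prophecy = Tuple[bool, bool]  # (component A may fail, component B may fail)


def prophecy_source(n_steps: int, seed: int = 0) -> Iterator[Prophecy]:
    """Virtual component: emits prophecy pairs that are never both True."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        yield rng.choice([(False, False), (True, False), (False, True)])


def channel_step(message: int, fail: bool) -> Optional[int]:
    """Lossy transmission: the fault is enabled only by the prophecy."""
    return None if fail else message


def run(messages: Iterable[int], prophecies: Iterable[Prophecy]
        ) -> Tuple[List[Optional[int]], List[Optional[int]]]:
    """Send every message over two replicated lossy channels."""
    out_a: List[Optional[int]] = []
    out_b: List[Optional[int]] = []
    for message, (fail_a, fail_b) in zip(messages, prophecies):
        out_a.append(channel_step(message, fail_a))
        out_b.append(channel_step(message, fail_b))
    return out_a, out_b
```

Under this fault assumption at least one of the two channels delivers each
message, a property that could then be used when verifying a replicated design.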







4.2 Detecting Errors 

Error detection in our setting consists, in its simplest case, of finding an
expression that is true iff the system is in an error state. The system itself
must be able to evaluate this expression, so that this expression can be used as
a precondition for error-correcting or -reporting transitions.

An easy way to detect errors is a modification of the fault transitions so
that every fault transition assigns a certain value to an error-indicating
variable. For example, a fault transition can set the variable fault to true,
while normal transitions leave this variable unchanged, as suggested in [12].
But this approach assumes the fault transitions to be controllable, which is in
general not the case: The faults are described according to experiences in the
real world, e.g. messages are simply lost from a channel without any component
reporting this event. We could change this lossy transition to one that reports
its occurrence, but the new variable fault may then only be used in proofs
investigating the correctness of the detection mechanism; it is not a variable
that is accessible by the system itself. We have to deal with given faults
described by modifications that we must accept untouched, but nevertheless we
want to detect them.

We suggest a way to handle errors that can be detected by finding
inconsistencies in the state of the system. The consistency can be denoted as a
formula $\Psi$ that is an invariant of the unmodified system. It can be proved
to be an invariant by the means of [3]. We can then remove all transitions with
$\neg\Psi$ as precondition (via $E$) and add a new error reacting transition
with an intended reaction (via $F$). Normally, a system occasionally contains
transitions that are enabled if $\neg\Psi$, simply because a set of transitions
can be indifferent to unspecified properties. Such a modification does not
change the original system, but allows the specification of reactions, e.g. by
sending an error message.
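
A minimal sketch of this construction, assuming that a transition is given by a
guard and an action on a dictionary-valued state: every original guard is
strengthened by the invariant, and one new transition guarded by the negated
invariant performs the reaction. The class `Transition`, the function
`add_detection` and the bounded-buffer example are assumptions made for the
illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]


@dataclass(frozen=True)
class Transition:
    name: str
    guard: Callable[[State], bool]
    action: Callable[[State], State]


def add_detection(transitions: List[Transition],
                  invariant: Callable[[State], bool],
                  reaction: Callable[[State], State]) -> List[Transition]:
    """Disable all transitions in inconsistent states, add a reaction."""
    guarded = [Transition(t.name,
                          lambda s, t=t: invariant(s) and t.guard(s),
                          t.action)
               for t in transitions]
    detector = Transition("report_error", lambda s: not invariant(s), reaction)
    return guarded + [detector]


def enabled(transitions: List[Transition], state: State) -> List[str]:
    """Names of the transitions whose guard holds in the given state."""
    return [t.name for t in transitions if t.guard(state)]


# Illustrative bounded buffer with capacity 2.
put = Transition("put", lambda s: s["count"] < 2,
                 lambda s: {**s, "count": s["count"] + 1})
get = Transition("get", lambda s: s["count"] != 0,
                 lambda s: {**s, "count": s["count"] - 1})
consistent = lambda s: 0 <= s["count"] <= 2
report = lambda s: {**s, "count": 0, "error_sent": 1}
system = add_detection([put, get], consistent, report)
```

In a consistent state such as `{"count": 1}` the same transitions are enabled
as before, while in the inconsistent state `{"count": 5}` only `report_error`
is enabled, although `get` was enabled there in the original system.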

This approach is conceptually the easiest way, since error detection is
immediate, but it is not always realistic. In [1] a more general approach is
presented that also allows delayed error detection. We have to integrate this
idea also in our stream-based setting, being specially interested in a notion of
a delayed detection that still occurs before an error becomes a failure.



4.3 Error Messages 

Once we have enabled a system to detect an error, we want it to react in an
appropriate way. If errors cannot be corrected, they should at least be
reported. Sending and receiving of error messages has to be integrated in the
system without changing its fault-free behavior.

In Section 3.1 we already saw that by adding an additional output channel, 
with arbitrary messages sent, the behavior will only be refined. So, extending a 
system to send error reporting messages is easy: We can add a transition that 
sends an error message in the case an error is detected while it leaves all other 
variables in V unchanged, and we refine the other transitions to send no output 
on this channel. 







We also want to react to error messages from other components. Therefore, 
we must be able to extend a component by a new input error message channel, 
and adapt the component to read error messages and react to them. A further 
transition in the system that reads from the new channel and reacts to it can 
easily be added while other transitions simply ignore the new channel. 

4.4 Correcting Faults 

We described ways in which a system can be modified to contain anticipated
faults already at the abstract level of specifications. The deviations of such a
modified system can show different degrees of effect: The effects of the faults
are harmless and preserve the properties of a specification, or the faults show
effects that violate the specification, but they are correctable, or the faults
lead to failures that are not correctable. The first case is of course the
easiest since no countermeasures have to be taken for the system to fulfill its
specification. In the last case, faults can only be detected and reported, as
described in the previous sections.

For correctable faults the system usually must be extended by mechanisms 
that enable the system to tolerate the faults. Several mechanisms are known, 
implementing e.g. fail-stop behavior, restarts, forward or backward recovery, 
replication of components, voters and more. All of these are correctors in the 
sense of [1]. 

A methodology supporting the development of dependable systems should 
offer patterns that describe when and how these mechanisms can be integrated 
in a specified system, together with the impact on the black-box properties. For 
example, a fail-stop behavior can be modeled by introducing a new trap state 
that was not yet reachable before, and that does not consume or generate any 
messages, while safety properties are not compromised. 

There is a special case of (local) correction of faults that can be done by new
components in a system that catch the effect of faults of a component before
they spread throughout the system. These new components, which we call drivers,
are placed between the fault-affected component and the rest of the system.
Depending on the characteristics and severity of the faults, the driver controls
just the output of the component, or controls the output with the knowledge of
the input, or even controls input and output, as shown in the following figure.
The last variant is the most general one, and could tolerate arbitrary failures
by totally ignoring the faulty component and simulating its correct
functionality.




Since we already know how to specify components and how to compose components
to systems, fault correction can be integrated as an ordinary development
step, so that results concerning methodology [5], tool support [8] and proof
support [3,4] can be used.









4.5 Fault-tolerance 

Usually, fault-tolerance is interpreted as the property of a system to fulfill
its purpose despite the presence of faults in the system, but also in their
absence (as pointed out e.g. in [9]). In our formalism, this could be expressed
by the following monotonicity property, stating that all partial modifications
of a system should maintain a certain property.

  $\forall E', F' \bullet E' \subseteq E \wedge F' \subseteq F \Rightarrow S \triangle (E', F') \models \Phi$

We think this condition is too strong, since too many partial modifications 
have to be considered. Assume a fault - being tolerable - that can be modeled 
by a change of a transition, expressed by removing the old and adding the new 
transition. If we just add the new one, but do not remove the old transition, we 
have a partial modification that could never happen in practice but results in a 
system with intolerable faults. Partial modifications are too fine-grained if they 
are based on single transitions. 

We suggest that a statement about fault-tolerance must be made explicit by 
specifying the faults and combinations of faults for which the system should have 
certain properties. As opposed to other approaches [9,12], a modification (with 
a nonempty E) can change a system so that it cannot show any execution of the 
original system. So, if a property is valid for the modified system, it is possibly 
not valid for the unmodified system. 

In our setting, explicit fault-tolerance can be expressed by generalizing our
expressions to allow sets of modifications. The following expression is defined
to be valid if $\forall i \bullet S \triangle \mathcal{M}_i \models \Phi$.

  $S \triangle \{\mathcal{M}_0, \mathcal{M}_1, \mathcal{M}_2, \ldots\} \models \Phi$

For a statement about fault-tolerance, the empty modification
$(\emptyset, \emptyset)$ has to be contained in the modification set, and the
desired combinations of modifications must be explicitly included. The induced
number of proof obligations needs further methodological support.
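
One proof obligation per modification can be sketched as follows, assuming (as
a restriction not made in the paper) that the property is a state invariant
that is checked on all reachable states of a finite system, and ignoring
environment transitions. The helper names and the example modifications are
hypothetical.

```python
from collections import deque
from typing import Callable, FrozenSet, Iterable, Set, Tuple

Transition = Tuple[str, str]
Transitions = FrozenSet[Transition]
Modification = Tuple[Transitions, Transitions]  # the pair (E, F)


def reachable(initial: Iterable[str], transitions: Transitions) -> Set[str]:
    """States reachable from the initial states (breadth-first search)."""
    seen = set(initial)
    queue = deque(seen)
    while queue:
        state = queue.popleft()
        for source, target in transitions:
            if source == state and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def holds_under_all(initial: Iterable[str], T: Transitions,
                    modifications: Iterable[Modification],
                    safe: Callable[[str], bool]) -> bool:
    """Check the invariant for every listed modification of the system."""
    start = list(initial)
    return all(safe(state)
               for E, F in modifications
               for state in reachable(start, (T - E) | F))


T = frozenset({("idle", "busy"), ("busy", "idle")})
NEUTRAL: Modification = (frozenset(), frozenset())
RETRY: Modification = (frozenset({("busy", "idle")}),
                       frozenset({("busy", "retry"), ("retry", "idle")}))
CRASH: Modification = (frozenset(), frozenset({("busy", "crashed")}))
not_crashed = lambda state: state != "crashed"
```

In this example the system tolerates the set `[NEUTRAL, RETRY]` with respect to
`not_crashed`, but not a set that also contains `CRASH`.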

5 Proving Properties 

The additional effort imposed by the use of formal methods for formalizing a 
system is rewarded by the possibility to prove that the systems have certain 
properties. While many formalisms offer this possibility theoretically, it is also 
important to offer methodology to find the proofs. In [3,4] we presented a way 
of proving properties for our system model, using proof rules, quite intuitive 
diagrams and tool support. 

It is crucial for a successful methodology that proofs can be found with 
reasonable effort. For fault-tolerance, it is desirable that proof obligations can be 
shown in a modular way. Results for an unmodified system should be transferred 
to modified results for the modified system. If properties of the correct system 







are already shown, this result should not be invalidated totally by modifying the 
system so that the verification has to start again from scratch. The existing proof 
should only be adapted accordingly, reflecting the modifications, using already 
gained results. 

So it seems to be an interesting research topic to find notions for modifying a
proof. Since a proof can be represented by a proof diagram, it can be promising
to investigate modifications of proof diagrams. If a transition is removed (by
$E$), a safety diagram stays valid also without this transition. In a liveness
diagram, new proof obligations emerge in this case, since the connectivity of
the graph must be checked again. Adding a transition via $F$ will - in most
cases - destroy the validity of a safety diagram, and will even introduce new
nodes. These new nodes have to be checked relative to all other transitions of
the system, and they will also appear in the liveness diagram, leading to a
number of additional proof obligations there. Nevertheless, parts of the diagram
stay unchanged and valid, representing a reuse of the existing proof.

6 Conclusions 

This paper discusses how faults can be modeled in the context of distributed
systems, composed of components that interact by asynchronous message passing.
We have shown how the behavior of such systems can be specified, using an
abstract black-box view or an operational state-based view. Faults of a system
are represented by the modifications that must be applied to the correct
system to obtain the faulty system. Modifications can change both the interface
and the behavior. For a modified system we can characterize its error states and
failures. Once the faults resp. modifications of a system are identified, the
ways how errors can be detected, reported, corrected and tolerated are also
discussed, mostly informally, in this paper.

Future Work The topic of the formal development (including the specification,
verification, and stepwise refinement) of fault tolerant systems is not yet explored
to a satisfying degree with concrete help for developing systems with faults.
It is a challenging task to combine various results found in the literature with this
paper’s approach based on message streams and black-box views. An ideal formal
framework combines the benefits of different approaches, and offers solutions
to several aspects such as formal foundation, methodology and verification support.

For a framework to be formal, all notions must be defined precisely. We need
a formal system model that is enriched by notions for faults and
their effects, errors, failures, changes of interfaces and internals, fault assumptions,
adaptation of properties to modifications of the system, and composition and
refinement of faults. But a language to express statements about fault-affected
or fault-tolerant systems is not enough; some methodological advice for its use is also
needed: When and why should faults be described? How can we refine a system
so that it stays unchanged in the fault-free case, but improves its fault tolerance
in the presence of faults? Formal methods
allow for formal verification. This has to be supported by suitable proof rules,
but even this is not enough: We also need description techniques for proofs and 
tool support for generating proof obligations and finding and checking proofs. 
Finally, only convincing case studies are able to show a recognizable benefit of 
the idea to formally develop fault-tolerant systems. 

References 

1. Anish Arora and Sandeep Kulkarni. Detectors and correctors: A theory of
fault-tolerance components. IEEE Transactions on Software Engineering, 1999.

2. Max Breitling. Modellierung und Beschreibung von Soll-/Ist-Abweichungen. In
Katharina Spies and Bernhard Schätz, editors, Formale Beschreibungstechniken
für verteilte Systeme, FBT’99, pages 35-44. Herbert Utz Verlag, 1999.

3. Max Breitling and Jan Philipps. Step by step to histories. In T. Rus, editor,
AMAST2000 - Algebraic Methodology And Software Technology, LNCS 1816, pages
11-25. Springer, 2000.

4. Max Breitling and Jan Philipps. Verification Diagrams for Dataflow Properties.
Technical Report TUM-I0005, Technische Universität München, 2000.

5. Manfred Broy and Ketil Stølen. Specification and Development of Interactive
Systems - FOCUS on Streams, Interfaces and Refinement. Springer, 2000. To appear.

6. Homepage of FOCUS, http://www4.in.tum.de/proj/focus/.

7. Felix C. Gärtner. A survey of transformational approaches to the specification and
verification of fault-tolerant systems. Technical Report TUD-BS-1999-04,
Darmstadt University of Technology, Darmstadt, Germany, April 1999.

8. Franz Huber, Bernhard Schätz, Alexander Schmidt, and Katharina Spies.
AutoFocus - A Tool for Distributed Systems Specification. In FTRTFT’96, LNCS 1135,
pages 467-470. Springer, 1996.

9. Tomasz Janowski. On bisimulation, fault-monotonicity and provable
fault-tolerance. In 6th International Conference on Algebraic Methodology and Software
Technology. LNCS, Springer, 1997.

10. J.C. Laprie. Dependability: Basic Concepts and Terminology, volume 5 of
Dependable Computing and Fault-Tolerant Systems. Springer, 1992.

11. P.A. Lee and T. Anderson. Fault Tolerance - Principles and Practice. Springer,
second, revised edition, 1990.

12. Zhiming Liu and Mathai Joseph. Specification and verification of recovery in
asynchronous communicating systems. In Jan Vytopil, editor, Formal Techniques in
Real-Time and Fault-Tolerant Systems, pages 137-166. Kluwer Academic
Publishers, 1993.

13. Doron Peled and Mathai Joseph. A compositional framework for fault-tolerance
by specification transformation. Theoretical Computer Science, 1994.

Acknowledgments I am grateful to Ingolf Krüger and Katharina Spies for inspiring
discussions and their comments on this paper, and thank the anonymous referees for
their very detailed and helpful remarks.




Threshold and Bounded-Delay Voting in Critical Control Systems*



Paul Caspi and Rym Salem 



Laboratoire Verimag (CNRS, UJF, INPG) 
{Caspi, Salem}@imag.fr 
http://www-verimag.imag.fr



Abstract. This paper investigates the possibility of implementing fault 
tolerance in control systems without using clock synchronization. We first 
show that threshold voting applies to stable continuous systems and that 
bounded delay voting applies to combinational systems. We also show 
that 2/2 bounded delay voting is insensitive to Byzantine faults and 
applies to stable sequential systems. It thus allows the implementation 
of hybrid fault tolerance strategies. 



1 Introduction 

It seems that, from the very early times of SIFT and FTMP [10,7], it was assumed
that highly fault-tolerant control systems had to be based on exact voting
and clock synchronization. This led to the discovery of consensus problems and
Byzantine faults [9], and produced an important academic activity [6], culminating
in the time-triggered approach to the development of these systems [8].

However, it also seems that, in practice, at least in the domain of critical control
systems, the use of clock synchronization is not so frequent [4,3]. We believe
there are historical reasons for this fact, which can be found in the evolution 
of these systems: control systems formerly used to be implemented with analog 
techniques, that is to say without any clock at all. Then, these implementations 
smoothly evolved toward discrete digital ones, and then toward computing ones. 
When a computer replaced an analog board, it was usually based on periodic 
sampling according to a real-time clock, and, for the sake of modularity, when 
several computers replaced analog boards, each one came with its own real-time
clock. Yet this raises the question of the techniques used in these systems
for implementing fault-tolerance without clock synchronization, and of the
well-foundedness of these techniques which equip, up to now, some of the most
safety-critical control systems ever designed (civil aircraft [4], nuclear plants [3], etc.).

This paper aims at providing some answers to this question. It will be or- 
ganized as follows: first we shall look at the architectural evolution from analog 
boards to distributed control systems. Then, we shall consider continuous con- 
trol. This will lead us to the concept of threshold voting, based on accuracy 
estimates: two signals are considered to disagree if they differ for more than
the maximum normal error. Then we shall look at discontinuous functions and
take boolean ones as an illustration. Here, combinational functions appear as the
analog of continuous ones. Yet boolean calculations are perfectly accurate but
the analogy is based on a space-time trade-off and yields bounded-delay voting:
two signals are considered to disagree if they remain different for more than the
maximum normal delay. Extending this scheme to apply to sequential functions
by transforming them into combinational ones, that is to say by bounded-delay
voting on the state, leads to problems of Byzantine faults. However, we show
here that 2/2 voting schemes are not sensitive to Byzantine faults and behave
properly. This provides us with self-checking schemes which can then be used to
build fault tolerant strategies by means of selective redundancy.

* This work has been partially supported by the CRISYS Esprit Project.

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 70-81, 2000.

© Springer-Verlag Berlin Heidelberg 2000

2 The Architectural Evolution 

Analog/digital communication: Aircraft control systems illustrate this evolution,
which can also be found in many other fields of industrial control: starting
from networks of analog boards, progressively some boards were replaced by 
discrete digital boards, and then by computers. Communication between the di- 
gital parts and the parts which remained analog was mainly based on periodic 
sampling (analog to digital conversion) and holding (digital to analog conver- 
sion), sampling periods being adapted to the frequency properties of the signals 
that traveled through the network. This allowed several technologies to smoothly 
cooperate. 

Serial links: this technique was suitable up to the time when two connected 
analog boards were replaced by digital ones. Then these two also had to commu- 
nicate and serial port communication appeared as the simplest way of replacing 
analog to digital and digital to analog communication as both can be seen as 
latches or memories. 

Field busses: then, for the sake of optimization, these serial links were replaced
by busses of several standards (aircraft, industrial, automotive). Most of them,
like Arinc429, just “pack” together several serial links, thus providing a kind
of “shared memory” service, on top of which synchronization services can be
implemented as needed.

Provision against Byzantine problems: In these very critical systems, Byzantine
faults cannot be neglected and this is why some architectural precautions have
to be taken in order to alleviate their consequences. For instance, these busses
provide some protection against Byzantine problems [9], in the sense that they
are based on broadcast: communication with several partners involves only one
emission. Thus a failed unit cannot tell different things to different partners.
Messages are also protected by error-correcting and/or error-detecting codes, which
can be assumed to be powerful enough for their failure to be negligible with respect to
the probabilistic fault tolerance requirements of the system under consideration.







Communication abstraction: according to what precedes, we can quite precisely 
state an abstract property of this kind of communication medium, which is a 
bounded delay communication property: 

Property 1 First, we assume that every process $P$ is periodic with a period
varying between small margins: $T_{Pm} \le T_P \le T_{PM}$.

Property 2 Let $T_{SM}$ and $T_{RM}$ be the respective maximal periods of the sender
and of the receiver, and $n$ the maximum number of consecutive failed receives
whose probability is not negligible with respect to fault tolerance requirements (in
the case of error correction, $n = 1$). Then the value $x_r(t)$ known at any time
$t$ by the receiver of some signal $x_s$ communicated by the sender is some $x_s(t')$,
where $|t - t'| \le T_{RM} + n T_{SM}$.

Definition 1. A signal $x'$ is a $\tau$ bounded delay image of a signal $x$ if there
exists a monotonic (retiming) function $t' : \mathbb{R}^+ \to \mathbb{R}^+$ such that

$\forall t \in \mathbb{R}^+,\ 0 \le t - t'(t) \le \tau$ and $x'(t) = x(t'(t))$
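
As a small illustration of Property 2, the following Python sketch computes the worst-case age of the value held by a receiver from the maximal receiver period, the maximal sender period and the number n of consecutive failed receives. The function name and the numeric values are illustrative assumptions, not taken from the paper.

```python
def max_communication_delay(t_rm: float, t_sm: float, n: int = 1) -> float:
    """Worst-case delay bound of Property 2: T_RM + n * T_SM.

    t_rm: maximal period of the receiver
    t_sm: maximal period of the sender
    n:    maximum number of consecutive failed receives considered
    """
    if t_rm <= 0 or t_sm <= 0:
        raise ValueError("periods must be positive")
    if n < 0:
        raise ValueError("n must be non-negative")
    return t_rm + n * t_sm


# Hypothetical figures: receiver period 10 ms, sender period 20 ms, n = 2.
tau = max_communication_delay(t_rm=10.0, t_sm=20.0, n=2)
print(tau)
```

The resulting value can then play the role of the bound in Definition 1 when reasoning about the received signal as a bounded delay image of the emitted one.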

3 Continuous Control 

3.1 Continuous Signals and Functions 

Most basic accuracy computations on continuous signals and functions over sig- 
nals can be based on standard uniform continuity: 

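Definition 2. A function $f \in \mathbb{R}^n \to \mathbb{R}^m$ is uniformly continuous if there
exists an error function $\eta_f \in \mathbb{R}^+ \to \mathbb{R}^+$ such that, for all $x, x' \in \mathbb{R}^n$ and
$\varepsilon \in \mathbb{R}^+$: $\|x' - x\| \le \eta_f(\varepsilon) \Rightarrow \|f(x') - f(x)\| \le \varepsilon$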

As function composition preserves uniform continuity, this easily allows the
computation of bounds for systems made of uniformly continuous static functions
fed by uniformly continuous signals through bounded delay networks. This,
in turn, allows the computation of voting thresholds or, more precisely, the
computation of periods such as to reach some accuracy or some voting threshold.
For instance, if the signal $x$ is uniformly continuous with error function $\eta_x$, then
$|t' - t| \le T \le \eta_x(\eta_f(\varepsilon)) \Rightarrow \|f(x(t')) - f(x(t))\| \le \varepsilon$
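
The composition argument can be made concrete under an extra assumption that is not in the paper: if both the signal and the function are Lipschitz with known constants, their error functions are simply eps / k, and composing them gives the largest admissible period. The helper names below are hypothetical.

```python
def lipschitz_error_function(k: float):
    """Error function eta(eps) = eps / k of a k-Lipschitz map (assumed model)."""
    if k <= 0:
        raise ValueError("Lipschitz constant must be positive")
    return lambda eps: eps / k


def max_period(eta_x, eta_f, eps: float) -> float:
    """Largest delay T such that |t' - t| <= T keeps the output error within eps.

    Obtained by composing the error functions: T = eta_x(eta_f(eps)).
    """
    return eta_x(eta_f(eps))


# Hypothetical constants: f is 4-Lipschitz, the signal x is 2-Lipschitz in time.
eta_f = lipschitz_error_function(4.0)
eta_x = lipschitz_error_function(2.0)
period = max_period(eta_x, eta_f, eps=1.0)
print(period)
```

Read in the other direction, the same composition gives the voting threshold that a given period can guarantee.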

3.2 Dynamical Systems 

However, static functions are quite rare in control algorithms, and one would 
rather find dynamical functions. There can be at least three different cases: 

The stable case consists of seeing dynamic systems as uniformly continuous
functions over normed spaces of time signals. If we want to compute deterministic
voting thresholds, an adequate norm is the $L_\infty$ one:

$\|x' - x\|_\infty = \sup\{\|x'(t) - x(t)\| \mid t \in \mathbb{R}^+\}$

If $f$ is a uniformly continuous dynamical function for this norm, fed with
uniformly continuous signals, we can easily reach the same bound as for static ones:







Theorem 1. If $x'$ is a $\tau$ bounded delay image of $x$, if $x$ is uniformly continuous
with error function $\eta_x$, and if $f$ is uniformly continuous with error function $\eta_f$,
then $\tau \le \eta_x(\eta_f(\varepsilon)) \Rightarrow \|f(x') - f(x)\|_\infty \le \varepsilon$

In some sense, this uniform continuity is closely linked to stability: this could be 
rephrased by saying that stable dynamical functions behave like static ones. 




Fig. 1. A Control System 



The stabilized case Unfortunately, controllers are often not stable: a typical
situation is shown in Figure 1. This closed-loop system computes the following
functional equations: $X = F(U, \Delta^T_{Y_0} Y)$ and $Y = G(X, Z)$, where $F$ is the
controller, $G$ is the system under control, $U$ is the vector of set values, $X$ holds the
control signals, $Y$ contains the measurements from the process to be controlled
that are fed back to the controller and $Z$ contains the external perturbations
that the system under control is subjected to. $\Delta$ is the unit delay for periodically
sampled signals¹:

$\Delta^T_{x_0} x(nT)$ = if $n = 0$
        then $x_0$
        else $x((n - 1)T)$

In order to yield an overall stable system, the system under control often requires
a controller that, when viewed in isolation from the control system, is not stable. But
the overall stable system still computes $X$ as a uniformly continuous function
of signals $U$ and $Z$. This allows bounds and thresholds to be computed as before.

The unstable case However, even in the case of overall stable systems, it may be
that some unstable variables are computed by the controller. Here, clock
synchronization can be useful. But, in several cases, computing unstable values is
a problem, regardless of fault tolerance, and some re-synchronization is
algorithmically provided, so as to get stable computations, for instance thanks to
Kalman filtering techniques. An example of such a situation can be found in [2].

¹ Indicating the sampling period allows mixing both time continuous and sampled
systems within the same equational framework; this feature is currently provided in
design tools such as Simulink and MATRIXx.










3.3 Threshold Voting 

Knowing bounds on the normal deviation between values that should be equal
easily allows the design of threshold voters. For instance, a 2/3 voter for scalar
values can be written:

$\mathit{voter2/3}(x_1, x_2, x_3, \varepsilon)$ = if $(|x_2 - x_1| \le \varepsilon)$ or $(|x_3 - x_2| \le \varepsilon)$ or $(|x_1 - x_3| \le \varepsilon)$
        then $\mathit{median}(x_1, x_2, x_3)$
        else alarm

where $\mathit{median}(x_1, x_2, x_3) = \max(\min(x_1, x_2), \min(x_2, x_3), \min(x_3, x_1))$
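
A direct transcription of this 2/3 threshold voter into Python might look as follows. It is only a sketch: the `ALARM` sentinel and the function names are assumptions made for the example, and the comparison uses "at most eps" to match the rule that signals disagree when they differ by more than the maximum normal error.

```python
ALARM = "alarm"  # hypothetical sentinel standing for the alarm output


def median3(x1: float, x2: float, x3: float) -> float:
    """Median of three values, written with min/max as in the text."""
    return max(min(x1, x2), min(x2, x3), min(x3, x1))


def voter_2_of_3(x1: float, x2: float, x3: float, eps: float):
    """Return the median if at least one pair agrees within eps, else ALARM."""
    if abs(x2 - x1) <= eps or abs(x3 - x2) <= eps or abs(x1 - x3) <= eps:
        return median3(x1, x2, x3)
    return ALARM


# Two replicas agree within the threshold, the third one has failed high.
print(voter_2_of_3(1.00, 1.02, 5.0, eps=0.05))
# No pair agrees: the voter raises the alarm.
print(voter_2_of_3(0.0, 1.0, 2.0, eps=0.5))
```

Taking the median rather than an average means a single faulty replica can never pull the voted value outside the range spanned by the two agreeing ones.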

Notations: In the sequel, algorithms are expressed by abstracting over time
indices; thus, $x_1 = x_2$ means $\forall n \in \mathbb{N} : x_1(nT) = x_2(nT)$.

4 Combinational Boolean Functions 

The case of combinational boolean functions closely resembles the static continuous
one, except that no error function is available. We thus need to
elaborate some notions which will help in recovering it.

4.1 Uniform Bounded Variability 

Periodically sampling boolean signals is only accurate if those signals do not 
vary too fast with respect to the sampling period. Bounded variability (closely 
linked to non Zenoness [1]) is intended to capture this fact. However, it is not 
strong enough a property in the same way as continuity is not strong enough 
for computing error bounds. What is needed in fact is the uniform version of it, 
a possible definition, which only presents a small deviation from the one in [5], 
being as follows: 

Definition 3. A signal $x$ defined over $\mathbb{R}^+$ has uniform bounded variability if it
is right-continuous and if there exists a discontinuity count function $c_x \in \mathbb{R}^+ \to \mathbb{N}$
such that any time interval of duration not larger than $\tau$ contains a number
of discontinuities not larger than $c_x(\tau)$:

$\forall t_1, t_2 : t_2 - t_1 \le \tau \Rightarrow |\{t \mid t \in [t_1, t_2] \wedge x(t^-) \ne x(t)\}| \le c_x(\tau)$

where $x(t^-)$ represents the left limit at time $t$. Thus, any time $t$ such that
$x(t^-) \ne x(t)$ is a discontinuity point.

This framework then allows us to define which boolean signals and tuples can
be sampled without losing too much information.

Definition 4. A boolean signal $x$ can be sampled with period $T$ if $c_x(T) = 1$.

This ensures that no value change will be lost by sampling. Another, even
more practical, way of defining the same thing is to define the minimum stable time
interval:







Definition 5. The minimum stable time interval $T_x$ of a boolean signal $x$
is the largest time interval such that $c_x(T_x) = 1$: $T_x = \sup\{T \mid c_x(T) \le 1\}$

Thus the sampling period should be smaller than $T_x$. We can now relate bounded
delays and minimum stable time interval:

Theorem 2 (Bounded delay and minimum stable time interval). If $x'$
is a $\tau$ bounded delay image of $x$, if $T_x$ is the minimum stable time interval of $x$
and if $\tau < T_x$ then the minimum stable time interval of $x'$ is: $T_{x'} = T_x - \tau$.
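
For a recorded trace, these quantities are easy to estimate. The sketch below assumes the signal is given as a finite list of discontinuity times (an assumption made for the example); the minimum stable interval is then the smallest gap between consecutive changes, and Theorem 2 shrinks it by the delay bound. All function names are hypothetical.

```python
def min_stable_interval(change_times) -> float:
    """Smallest gap between consecutive discontinuities of a finite trace.

    Returns infinity when the trace has fewer than two changes.
    """
    ts = sorted(change_times)
    if len(ts) < 2:
        return float("inf")
    return min(later - earlier for earlier, later in zip(ts, ts[1:]))


def delayed_min_stable_interval(t_x: float, tau: float) -> float:
    """Theorem 2: stable interval of a tau bounded delay image, T_x - tau."""
    if not 0 <= tau < t_x:
        raise ValueError("Theorem 2 requires 0 <= tau < T_x")
    return t_x - tau


def can_be_sampled(t_x: float, period: float) -> bool:
    """The sampling period should be smaller than the minimum stable interval."""
    return period < t_x


t_x = min_stable_interval([0.0, 5.0, 7.0, 15.0])
t_x_delayed = delayed_min_stable_interval(t_x, tau=0.5)
print(t_x, t_x_delayed, can_be_sampled(t_x_delayed, period=1.0))
```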

Remark: this result can be improved by considering both minimum $\tau_m$ and
maximum $\tau_M$ delays: $T_{x'} = T_x - (\tau_M - \tau_m)$. Yet the extension to independent
tuples is more difficult, since we cannot ensure that no value change will be
lost. This is a general problem which arises in any periodically sampled system.
In most cases, the best we can do is to choose the least period such that each
component of the tuple is well sampled in the preceding sense. Nevertheless, we
can relate the least minimum stable time interval $T_x$ of each component of an
$n$-tuple $X$ to some time interval where the value of the tuple remains constant:

Theorem 3. Let $X$ be an $n$-tuple such that each component has a minimum
stable interval larger than $T_x$. Then, in each time interval of duration larger than
$T_x$, there exists a time interval of duration at least $T_x/(n+1)$ where the value of the
tuple remains constant.



4.2 Bounded Variability and Bounded Delays: Confirmation 

However, delays do not combine nicely as errors do: the effect of errors on the 
arguments of a computation amounts to some error on the result. This is in 
general not true for delays: if the arguments of a computation come with distinct 
delays, the result may not always be a delayed image of the ideal (not delayed) 
result. 

Fortunately, some filtering procedures make it possible to change incoherent
delays into coherent ones: these are known as confirmation procedures. Assume
a tuple $X' = \{x'_i, i = 1, n\}$ of boolean signals arriving in one unit of period
$T$ from different units such that it is the image of an ideal tuple $X$ through
bounded delays: $\forall i = 1, n,\ 0 \le t - t'_i(t) \le \tau_x : x'_i(t) = x_i(t'_i(t))$, where we
assume that all delays have the same upper bound $\tau_x$. We consider the following
confirmation function:

Definition 6 (Confirmation function).

$\mathit{confirm}(X', nmax) = X''$
where $X'', n$ = if $X' \ne \Delta_{X_0} X'$
        then $\Delta_{X_0} X''$, 0
        else if $\Delta_0 n < nmax - 1$
        then $\Delta_{X_0} X''$, $\Delta_0 n + 1$
        else $X'$, $\Delta_0 n$







- this confirm function stores a counter $n$ with initial value 0, and its previous
output, with some known initial value $X_0$,

- whenever its input changes, it outputs its previous output and resets the
counter,

- else, if the counter has not reached $nmax - 1$, it increments it and outputs
the previous output,

- else it outputs the current input and leaves the counter unchanged.
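
The behaviour just described can be sketched as a small stateful object, called once per sampling period. This is an illustrative reading, not the authors' implementation: the class name is hypothetical, tuples are plain Python tuples, and the previous input is assumed to start at the same initial value X0 as the previous output.

```python
class Confirm:
    """Confirmation filter: forward an input only after it has been stable long enough."""

    def __init__(self, x0, nmax: int):
        self.prev_in = x0    # previous input (assumed to start at X0)
        self.prev_out = x0   # previous output, initial value X0
        self.n = 0           # counter, initial value 0
        self.nmax = nmax

    def step(self, x):
        if x != self.prev_in:
            # Input changed: keep the previous output and reset the counter.
            out, n = self.prev_out, 0
        elif self.n < self.nmax - 1:
            # Input stable but not yet confirmed: keep waiting.
            out, n = self.prev_out, self.n + 1
        else:
            # Input stable for long enough: let it through.
            out, n = x, self.n
        self.prev_in, self.prev_out, self.n = x, out, n
        return out


confirm = Confirm(x0=(False, False), nmax=3)
outputs = [confirm.step((True, False)) for _ in range(4)]
print(outputs)
```

With nmax = 3, the new tuple only appears on the fourth sample: one sample to notice the change and nmax - 1 further samples to confirm it.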

We further assume that $nmax$ is the maximum number of samples that can occur
within the maximum delay, $nmax = E(\tau_x / T_m) + 1$, where $E$ denotes the integer
part function, and that the minimum stable interval of each component of the
tuple exceeds $(n \cdot nmax + 1) T_M$. We can then prove:

Theorem 4. The confirm function outputs a delayed image of the original tuple:
$\forall t,\ 0 \le t - t'(t) \le (n \cdot nmax + 1) T_M : \mathit{confirm}(X', nmax)(t) = X(t'(t))$

This shows that incoherences due to variable delays can be changed into coherent 
delays. Once this coherent delay problem is solved, we can consider bounded 
delay voting. But we can note here an interesting by-product of this confirm 
function: 

Corollary 1. The output of a confirm function $\mathit{confirm}(X, nmax)$ remains
constant for time intervals of duration at least $nmax\, T_m$.

4.3 Bounded Delay Voting 

Let us consider several copies of a boolean signal $x$, received by some unit of
period $T$ with a maximum normal delay $\tau_x$ such that $\tau_x + T_M < T_x$. Then,

- the maximum time interval where two correct copies may continuously
disagree is obviously $\tau_x$,

- the maximum number of samples where two correct copies continuously
disagree is $nmax = E(\tau_x / T_m) + 1$

This allows us to design bounded-delay voters for bounded-delay boolean
signals. For instance, a 2/2 voter could be:

Definition 7 (2/2 bounded-delay voter).

$\mathit{voter2/2}(x_1, x_2, nmax) = x$
where $x, n$ = if $x_2 = x_1$
        then $x_1$, 0
        else if $\Delta_0 n < nmax - 1$
        then $\Delta_{x_0} x$, $\Delta_0 n + 1$
        else alarm

- this voter maintains a counter $n$ with initial value 0, and its previous output,
with some known initial value $x_0$,

- whenever the two inputs agree, it outputs one input and resets the counter,

- else, if the counter has not reached $nmax - 1$, it increments it and outputs
the previous output,

- else it raises an alarm.



Theorem 5. $\mathit{voter2/2}$ raises an alarm if inputs disagree for more than $nmax\, T_M$;
otherwise it delivers the correct value with maximum delay $(nmax + 1) T_M$.
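
The 2/2 bounded-delay voter can be sketched in the same style. Several points are assumptions of the example rather than statements from the paper: the `ALARM` sentinel, the class and helper names, the choice to leave the state untouched while the alarm is raised, and the computation of nmax as the integer part of the maximum delay divided by the minimum period, plus one.

```python
import math

ALARM = "alarm"  # hypothetical sentinel standing for the alarm output


def nmax_for(tau: float, t_min: float) -> int:
    """Assumed reading of nmax: integer part of tau / t_min, plus one."""
    return math.floor(tau / t_min) + 1


class Voter22:
    """2/2 bounded-delay voter, called once per sampling period."""

    def __init__(self, x0, nmax: int):
        self.prev_out = x0  # previous output, initial value x0
        self.n = 0          # disagreement counter, initial value 0
        self.nmax = nmax

    def step(self, x1, x2):
        if x1 == x2:
            # Inputs agree: output the common value and reset the counter.
            self.prev_out, self.n = x1, 0
            return x1
        if self.n < self.nmax - 1:
            # Tolerated disagreement: hold the previous output.
            self.n += 1
            return self.prev_out
        # Disagreement has lasted too long (state is left unchanged here).
        return ALARM


voter = Voter22(x0=False, nmax=nmax_for(tau=2.5, t_min=1.0))
trace = [voter.step(True, True)] + [voter.step(True, False) for _ in range(3)]
print(trace)
```

In this run the voter tolerates two disagreeing samples by holding the last agreed value, and raises the alarm on the third.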



4.4 Bounded Delay Voting on Tuples 

Combining bounded delay voting on booleans and confirmation functions allows 
for the definition of voters for tuples and combinational functions: 

Definition 8 (2/2 bounded-delay voter for tuples). 

$\mathit{Voter2/2}(X_1, X_2, nmax) = \mathit{confirm}(X'', nmax)$
where $\forall i \in \{1, n\} : x''_i = \mathit{voter2/2}(x_{1,i}, x_{2,i}, nmax)$

Then the following theorem holds, assuming:

- $X_i, i = 1, 2$ are bounded delay images of the same $X$,

- each component of $X$ has minimum stable time interval $T_x$,

- $nmax = E(\tau_x / T_m) + 1$,

- $T_x > (n \cdot nmax + 1) T_M$.



Theorem 6. $\mathit{Voter2/2}$ raises an alarm if any corresponding components of the
two inputs disagree for more than $nmax\, T_M$ and otherwise delivers the correct
value with maximum delay $(n \cdot nmax + 1) T_M$.

This is the combination of theorems 4 and 5, except that we do not
need to propagate the +1 additional delay from the voter to the confirmation
function, because both occur in the same computing unit.
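
The composition of component-wise 2/2 voting with confirmation can be sketched with closures, so that the example stays self-contained. The factory names and the `ALARM` sentinel are hypothetical, and propagating any component alarm to the whole tuple is an assumption of this sketch.

```python
ALARM = "alarm"  # hypothetical sentinel standing for the alarm output


def make_voter22(x0, nmax):
    """Scalar 2/2 bounded-delay voter as a closure over its state."""
    state = {"out": x0, "n": 0}

    def step(x1, x2):
        if x1 == x2:
            state["out"], state["n"] = x1, 0
            return x1
        if state["n"] < nmax - 1:
            state["n"] += 1
            return state["out"]
        return ALARM

    return step


def make_confirm(x0, nmax):
    """Confirmation filter as a closure over previous input, output and counter."""
    state = {"inp": x0, "out": x0, "n": 0}

    def step(x):
        if x != state["inp"]:
            out, n = state["out"], 0
        elif state["n"] < nmax - 1:
            out, n = state["out"], state["n"] + 1
        else:
            out, n = x, state["n"]
        state.update(inp=x, out=out, n=n)
        return out

    return step


def make_tuple_voter22(x0, nmax):
    """Tuple voter: one scalar voter per component, followed by confirmation."""
    voters = [make_voter22(v, nmax) for v in x0]
    confirm = make_confirm(tuple(x0), nmax)

    def step(xa, xb):
        voted = tuple(v(a, b) for v, a, b in zip(voters, xa, xb))
        if ALARM in voted:
            return ALARM  # assumption: any component alarm alarms the tuple
        return confirm(voted)

    return step


vote = make_tuple_voter22((False, False), nmax=2)
print([vote((True, False), (True, False)) for _ in range(3)])
```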



5 Sequential Functions 

For the sake of simplicity, we do not distinguish here between states and outputs.
Hence, a sequential function $\mathcal{F}$ will be defined by its transition function $F$:
$\mathcal{F}(U) = X$, where $X = F(\Delta_{X_0} X, U)$



Definition 9. A 1-stable sequential function is a sequential function whose
state only changes when its input changes. For all states $X$ and inputs $U$:
$F(F(X, U), U) = F(X, U)$
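
For finite state and input spaces, this stability condition can be checked exhaustively. The sketch below is illustrative only: the set/reset latch and the toggle are hypothetical example transition functions, chosen because the first settles in one step while the second keeps changing under a constant input.

```python
from itertools import product


def is_one_stable(f, states, inputs) -> bool:
    """Check F(F(X, U), U) == F(X, U) for every state X and input U."""
    return all(f(f(x, u), u) == f(x, u) for x, u in product(states, inputs))


def sr_latch(x: bool, u) -> bool:
    """Set-dominant set/reset latch: u is a pair (set, reset)."""
    s, r = u
    return s or (x and not r)


def toggle(x: bool, u: bool) -> bool:
    """Flips the state at every step while the input is true."""
    return x != u


booleans = [False, True]
pairs = list(product(booleans, repeat=2))
print(is_one_stable(sr_latch, booleans, pairs))   # latch settles after one step
print(is_one_stable(toggle, booleans, booleans))  # toggle never settles
```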







5.1 Bounded-Delay Voting for Sequential Functions 

An idea for applying bounded-delay voting to sequential functions would be
to transform sequential functions into combinational ones by setting apart state
memorization and voting on states as well as on inputs, i.e. instead of computing
$X_i = F(\Delta_{X_0} X_i, \mathit{vote}(U))$, compute $X_i = F(\mathit{vote}(\Delta_{X_0} X), \mathit{vote}(U))$.

This does not work in general, for Byzantine-like reasons: our bounded-delay 
voters are sequential systems in the sense that they store as state variable the 
last value on which units agreed; then a malignant unit may successively agree 
with states of correct units, while these units disagree for delay reasons, leading 
the state voters of these correct units to store incoherent values. 

5.2 2/2 Vote for Sequential Functions 

Quite surprisingly, this phenomenon does not seem to appear for 2/2 bounded 
delay vote. 2/2 voting is in general not sensitive to Byzantine behaviors because 
a malignant unit only communicates with a correct one: either it agrees with it 
and behaves like the correct one or it disagrees and the fault is detected. This 
property of avoiding Byzantine problems may explain why this fault-detection 
strategy is very popular. Let us consider the voting scheme: 

$X_1 = F(\mathit{vote2/2}(\Delta_{X_0} X_1, X'_2, n(nmax_u + nmax_x)),$
        $\mathit{confirm}(\{\mathit{vote2/2}(u'_{1,i}, u'_{2,i}, nmax_u), i = 1, n\},$
                $nmax_u + nmax_x))$

$X_2 = F(\mathit{vote2/2}(\Delta_{X_0} X_2, X'_1, n(nmax_u + nmax_x)),$
        $\mathit{confirm}(\{\mathit{vote2/2}(u''_{1,i}, u''_{2,i}, nmax_u), i = 1, n\},$
                $nmax_u + nmax_x))$

where

- $F$ is a 1-stable transition function.

- $U'$ and $U''$ are $n$-tuples linked to some $U$ by bounded delays less than $\tau_u$
and $nmax_u = E(\tau_u / T_m) + 1$.

- $X'_i$ are linked to $X_i$ by bounded delays less than $\tau_x$ and $nmax_x = E(\tau_x / T_m) + 1$.

- we assume $\tau_x < \tau_u$

- the minimum stable interval of each component of $U$ is larger than
$(n \cdot (nmax_u + nmax_x) + 1) T_M$.

Remark: we use here our simple boolean votes extended to tuples and not our
tuple voters. The reasons are that, for states, we do not need confirmation,
because one tuple comes from the very unit which computes the vote and the
other one comes in one shot from the other unit, and for inputs, we set apart
the confirmation because we need a longer stable delay.

Theorem 7. In the absence of faults, any state voter delivers a correct state
with maximum delay $(n(nmax_u + nmax_x) + 1) T_M$ and delivers an alarm if
some input components disagree for more than $nmax_u T_M$ or one of the two
units does not compute as specified for more than $n(nmax_u + nmax_x) T_M$.







Proof: The proof is by induction on an execution path: Initially, inputs and
states agree. Assume a time instant where inputs and states agree. From the
stability assumption, nothing will change unless inputs change. Assume some
input changes. It takes at most $(nmax_u + 1)T$ for both units to agree on this input
change and at most $(nmax_x + 1)T$ to agree on the corresponding state change.
If this is the case we are done. Now it may be the case that meanwhile another
input component changes. If the agreement on states has not been reached, no
harm is done because, in both units, state voters still maintain the old state. If
the agreement on states is reached in one unit, it will certainly be reached in the
other one, because the “confirm” function freezes inputs for a sufficient delay
for reaching an agreement in both units (this part of the proof is illustrated in
Figure 2). By induction on the number of components, if a state agreement has
not been reached before, it will be reached when every input component has
changed once, because no component can change again, by the minimum stable
interval assumption. Finally the alarm part of the theorem is obvious from the
voting algorithm.

[Figure 2: timelines of processes a and b, with the delays $\tau_u$ and $\tau_x$ marked;
the successive computations are labelled $X = F(X, U)$, $X_1 = F(X, U_1)$ and $X_1 = F(X_1, U_1)$.]

Fig. 2. Proof illustration: process a sees the input changing from $U$ to $U_1$ first and
computes $X_1$. Then process b sees the same change and also computes $X_1$. Having
received $X_1$ from a, its state voter moves to $X_1$ but, for the sake of stability, it goes
on computing $X_1$. Now this value will in turn reach a before the input changes again,
thus allowing state voters to reach an agreement on $X_1$.



5.3 From Fault Detection to Fault Tolerance

2/2 voters account for fault detection and for the design of self-checking modules. 
Now these modules can be combined by means of selective redundancy so as to 
form fault-tolerant modules. In the Airbus architecture, this goal is achieved 
thanks to a global system approach: two self-checking computing systems are 
provided, each one being able to control the aircraft by itself. 

Another possibility is to design hybrid redundancy voters: a primary self- 
checking system operates until it fails. Meanwhile a secondary system votes on 
the states of the primary one, so as to stay coherent with it. When the primary 
system fails, the control is moved to the secondary one which is hopefully still 
correct. 

When designing it, one has to solve the following problem: we switch from the
primary system to the secondary one when the watchdog counting the number
of disagreements has expired. But then, it may be the case that, because of
asynchrony, the secondary system is not yet coherent. We must not raise an
alarm here but, on the contrary, give this system the time to reach a coherent
state. This is achieved by the following resetting 2/2 voter:

Definition 10 (2/2 Resetting Voter).

$\mathit{rvoter2/2}(x_1, x_2, reset, nmax) = x$
where $x$ = if $x_2 = x_1$
        then $x_1$
        else if $\Delta_0 n < nmax - 1$
        then $\Delta_{x_0} x$
        else alarm
and $n$ = if $x_2 = x_1 \vee reset$
        then 0
        else if $\Delta_0 n < nmax - 1$
        then $\Delta_0 n + 1$
        else alarm



We are now in a position to design the hybrid voter: 

Definition 11 (1/2 x 2/2 Voter).

$\mathit{voter1/2 \times 2/2}(X_1, X_2, X_3, X_4, nmax) = X$
where $X = \mathit{rvoter2/2}(X'_1, X'_2, reset, nmax)$
and $X'_1, X'_2$ = if $primary$
        then $X_1, X_2$
        else $X_3, X_4$
and $primary$ = if $X = alarm$
        then false
        else $primary$
and $reset = (\neg primary) \wedge (\Delta_{true}\, primary)$

- primary is initially true and the vote is performed on units 1 and 2.

- When one of these fails, primary becomes false forever and the vote is
performed on units 3 and 4.



6 Conclusion 

Finally, it seems that we have been able to provide interesting fault tolerant
schemes based only on the timing properties of periodic unsynchronized systems.
This was quite easy to do for continuous and combinational systems. The
problem was more involved for sequential stable systems but, nevertheless, we
have found a fault detection scheme which applies to this case and is still
based only on timing properties. This allows self-checking dual modules to be built
that can serve as building blocks for more elaborate fault tolerant strategies.







One quite obvious possible use of this work is to help certification authorities
gain better insight into these techniques, which are still in use on many
systems. Another would be to help designers choose between
clock synchronization techniques and the ones presented above: one outcome of
this work is the computation of periods depending on some characteristics of
the application, mainly the minimum stable time interval of the inputs, and the
dimension of the involved tuples. This clearly has to be balanced against the cost
of clock synchronization.

In this setting, a question left open in this work is that of unstable sequential
systems, to which our techniques clearly do not apply. But this raises
the question of when the implementation of critical control systems does require
programming unstable systems. Our opinion is that, as for continuous control,
this is seldom the case. More generally, the celebrated “Y2K” bug tells us that
unstable systems should be avoided as much as possible.

Last but not least, we hope to attract the attention of computer scientists to
this kind of technique, which seems to have been somewhat neglected in the past.
It is our opinion that there is by now some revival of the asynchronous circuit
culture, to which this work may contribute.



References 

1. M. Abadi and L. Lamport. An old-fashioned recipe for real time. ACM Transac- 
tions on Programming Languages and Systems, 16(5):1543-1571, 1994. 

2. S. Bensalem, P. Caspi, C. Dumas, and C. Parent-Vigouroux. A methodology for 
proving control programs with Lustre and PVS. In Dependable Computing for 
Critical Applications, DCCA-7, San Jose. IEEE Computer Society, January 1999. 

3. J.-L. Bergerand and E. Pilaud. Saga: A software development environment for
dependability in automatic control. In IFAC-SAFECOMP’88. Pergamon Press,
1988.

4. D. Briere, D. Ribot, D. Pilaud, and J.L. Camus. Methods and specification tools 
for Airbus on-board systems. In Avionics Conference and Exhibition, London, 
December 1994. ERA Technology. 

5. P. Caspi and N. Halbwachs. A functional model for describing and reasoning about 
time behaviour of computing systems. Acta Informatica, 22:595-627, 1986. 

6. M.J. Fischer, N.A. Lynch, and M.S. Paterson. Impossibility of distributed
consensus with one faulty processor. Journal of the ACM, 32(2):374-382, 1985.

7. A.H. Hopkins, T. Basil Smith, and J.H. Lala. FTMP: A highly reliable fault-tolerant
multiprocessor for aircraft. Proceedings of the IEEE, 66(10):1221-1239, 1978.

8. H. Kopetz, A. Damm, Ch. Koza, M. Mulazzani, W. Schwabl, Ch. Senft, and 
R. Zainlinger. Distributed fault-tolerant real-time systems: the MARS approach. 
IEEE Micro, 9(1):25-40, 1989.

9. M. Pease, R.E. Shostak, and L. Lamport. Reaching agreement in the presence of 
faults. Journal of the ACM, 27(2):228-237, 1980. 

10. J.H. Wensley, L. Lamport, J. Goldberg, M.W. Green, K.N. Levitt, P.M.
Melliar-Smith, R.E. Shostak, and Ch.B. Weinstock. SIFT: Design and analysis of a
fault-tolerant computer for aircraft control. Proceedings of the IEEE, 66(10):1240-1255,
1978.




Automating the Addition of Fault-Tolerance 



Sandeep S. Kulkarni 
Department of Computer 
Science and Engineering 
Michigan State University 
East Lansing MI 48824 USA 



Anish Arora 

Department of Computer 
and Information Science 
Ohio State University 
Columbus Ohio 43210 USA 



Abstract 

In this paper, we focus on automating the transformation of a given fault-intolerant 
program into a fault-tolerant program. We show how such a transformation can be 
done for three levels of fault-tolerance properties, failsafe, nonmasking and masking. 
For the high atomicity model where the program can read all the variables and write 
all the variables in one atomic step, we show that all three transformations can be 
performed in polynomial time in the state space of the fault-intolerant program. For 
the low atomicity model where restrictions are imposed on the ability of programs to 
read and write variables, we show that all three transformations can be performed in 
exponential time in the state space of the fault-intolerant program. We also show that
the problem of adding masking fault-tolerance is NP-hard and, hence, exponential
complexity is inevitable unless P = NP.



1 Introduction 



In this paper, we focus on automating the transformation of a fault-intolerant program 
into a fault-tolerant program. The motivations behind this work are multi-fold. The 
first motivation comes from the fact that the designer of a fault-tolerant program is 
often aware of a corresponding fault-intolerant program that is known to be correct 
in the absence of faults. Or, the designer may be able to develop a fault-intolerant 
program and its manual proof in a simple way. In these cases, it is expected that the 
designer will benefit from reusing that fault-intolerant program rather than starting 
from scratch. Moreover, the reuse of the fault-intolerant program will be virtually
mandatory if the designer has only an incomplete specification and the computations
of the fault-intolerant program are the de facto specification.

The second motivation is that the use of such automated transformation will obviate 
the need for manually constructing the proof of correctness of the synthesized fault- 
tolerant program as the synthesized program will be correct by construction. This 
advantage is especially useful when designing concurrent and fault-tolerant programs as 
it is well-understood that manually constructing proofs of correctness for such programs 
is especially hard. 

Email: sandeep@cse.msu.edu, anish@cis.ohio-state.edu. Web: http://www.cse.msu.edu/~sandeep,
http://www.cis.ohio-state.edu/~anish. Tel: +1-517-355-2387.
Arora is currently on sabbatical leave at Microsoft Research. This work was
partially sponsored by NSA Grant MDA904-96-1-0111, NSF Grant NSF-CCR-9972368,
an Ameritech Faculty Fellowship, a grant from Microsoft Research, and a grant from
Michigan State University.

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 82-93, 2000. 

© Springer-Verlag Berlin Heidelberg 2000







The third motivation stems from our previous work [1,2] that shows that a fault- 
tolerant program can be expressed as a composition of a fault-intolerant program and a 
set of ‘fault-tolerance components’. The fault-intolerant program is responsible for en- 
suring that the fault-tolerant program works correctly in the absence of faults; it plays 
no role in dealing with fault-tolerance. The fault-tolerance components are responsible
for ensuring that the fault-tolerant program deals with the faults in accordance with the
level of tolerance desired; they play no role in ensuring that the program works
correctly in the absence of faults. We have also found that the fault-tolerance components
help in manually designing fault-tolerant programs as well as in manually constructing 
their proofs [2]. Moreover, we have found that programs designed using fault-tolerance 
components are easier to understand and have a better structure [2] than programs 
designed from scratch. 

The third motivation suggests that, given a fault-intolerant program p, we should
focus on transforming it to obtain a fault-tolerant program p' such that the
transformation is done solely for the purpose of dealing with faults according to the level of
fault-tolerance desired. More specifically, it suggests that p' should not introduce new
ways to satisfy the specification in the absence of faults.

We study the problem of transforming a fault-intolerant program into a fault- 
tolerant program for three levels of fault-tolerance properties, namely, failsafe, non- 
masking and masking. Intuitively, a failsafe fault-tolerant program only satisfies the 
safety of its specification, a nonmasking fault-tolerant program recovers to a state 
from where its subsequent computation is in the specification, and a masking fault- 
tolerant program satisfies the specification even in the presence of faults. (See Section 
2 for precise definitions.) 

For each of the three levels of fault-tolerance properties, we study the transformation
problem in the context of two models: the high atomicity model and the low
atomicity model. In the high atomicity model, the program can read and write all 
its variables in one atomic step. In the low atomicity model, the program consists of 
a set of processes, and the model specifies restrictions on the ability of processes to 
atomically read and write program variables. Thus, the transformation problem in the 
low atomicity model requires us to derive a fault-tolerant program that respects the 
restrictions imposed by the low atomicity model. 

The main contributions are as follows: (1) For the high atomicity model, we present
a sound and complete algorithm that solves the transformation problem. The complexity
of our algorithm is polynomial in the state space of the fault-intolerant program
(cf. Section 4). (2) For the low atomicity model, we present a sound and complete
algorithm that solves the transformation problem. The complexity of our algorithm
is exponential in the state space of the fault-intolerant program (cf. Section 5.1). (3)
We also show that for the low atomicity model, the problem of transforming a
fault-intolerant program into a masking fault-tolerant program is NP-hard. It follows that
there is no sound and complete polynomial-time algorithm to solve the problem of adding
masking fault-tolerance unless P = NP (for reasons of space, we relegate the proof of
NP-completeness to [3]).

Organization of the paper. This paper is organized as follows: We provide the 
definitions of programs, specifications, faults and fault-tolerance in Section 2. Using 
these definitions, we state the transformation problem in Section 3. In Section 4, we 
show how the transformation problem is solved in the high atomicity model. In Section 
5, we show how to characterize the low atomicity model and sketch our algorithm for 
the low atomicity model. Finally, we discuss related work and concluding remarks in 
Section 6. (For reasons of space, we refer the reader to [3] for the proofs of correctness
and the examples of programs constructed using our algorithms.)







2 Programs, Specifications, Faults, and Fault-Tolerance

In this section, we give formal definitions of programs, problem specifications, faults, 
and fault-tolerance. The programs are specified in terms of their state space and their 
transitions. The definition of specifications is adapted from Alpern and Schneider [4].
The definitions of faults and fault-tolerance are adapted from our previous work.

2.1 Program 

Definition. A program p is a tuple $\langle S_p, \delta_p \rangle$ where $S_p$ is a finite set of states, and $\delta_p$
is a subset of $\{(s_0, s_1) : s_0, s_1 \in S_p\}$. □

Definition (State predicate). A state predicate of p $(= \langle S_p, \delta_p \rangle)$ is any subset of $S_p$. □

Notation. A state predicate S is true in state s iff $s \in S$.

Definition (Projection). Let p $(= \langle S_p, \delta_p \rangle)$ be a program, and let S be a state
predicate of p. We define the projection of p on S, denoted as $p|S$, as the program
$\langle S_p, \{(s_0, s_1) : (s_0, s_1) \in \delta_p \wedge s_0, s_1 \in S\} \rangle$. □

Note that $p|S$ consists of transitions of p that start in S and end in S.

Definition (Subset). Let p $(= \langle S_p, \delta_p \rangle)$ and p' $(= \langle S_{p'}, \delta_{p'} \rangle)$ be programs. We say
$p' \subseteq p$ iff $S_{p'} = S_p$ and $\delta_{p'} \subseteq \delta_p$. □

Definition (Closure). A state predicate S is closed in a set of transitions $\delta_p$ iff
$(\forall (s_0, s_1) : (s_0, s_1) \in \delta_p : (s_0 \in S \Rightarrow s_1 \in S))$. □

Definition (Computation). A sequence of states, $\langle s_0, s_1, \ldots \rangle$, is a computation of
p $(= \langle S_p, \delta_p \rangle)$ iff the following two conditions are satisfied:

- $\forall j : j > 0 : (s_{j-1}, s_j) \in \delta_p$,

- if $\langle s_0, s_1, \ldots \rangle$ is finite and terminates in state $s_l$ then there does not exist state s
  such that $(s_l, s) \in \delta_p$. □

Notation. We call $\delta_p$ the transitions of p. When it is clear from context, we use p
and $\delta_p$ interchangeably, e.g., we say that a state predicate S is closed in p $(= \langle S_p, \delta_p \rangle)$
to mean that S is closed in $\delta_p$.
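The definitions of this subsection translate directly into operations on finite sets. The following Python sketch is only an illustration under that reading: the names `Program`, `project` and `is_closed` are hypothetical and do not appear in the paper, and the three-state example is invented.

```python
from typing import FrozenSet, Hashable, NamedTuple, Tuple

State = Hashable
Transition = Tuple[State, State]


class Program(NamedTuple):
    """A program p = (S_p, delta_p): a finite state set and a transition set."""
    states: FrozenSet[State]
    transitions: FrozenSet[Transition]


def project(p: Program, S: FrozenSet[State]) -> Program:
    """p|S: keep only the transitions of p that start and end in S."""
    kept = frozenset((a, b) for (a, b) in p.transitions if a in S and b in S)
    return Program(p.states, kept)


def is_closed(S: FrozenSet[State], transitions: FrozenSet[Transition]) -> bool:
    """S is closed iff every transition that starts in S also ends in S."""
    return all(b in S for (a, b) in transitions if a in S)


# Invented example: 0 -> 1 -> 0 and 1 -> 2.
p = Program(frozenset({0, 1, 2}), frozenset({(0, 1), (1, 0), (1, 2)}))
S = frozenset({0, 1})
print(project(p, S).transitions == frozenset({(0, 1), (1, 0)}))  # True
print(is_closed(S, p.transitions))  # False, because (1, 2) leaves S
```

Using frozen sets keeps a program value immutable, which matches the paper's treatment of a program as a fixed tuple.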

2.2 Specification 

Definition. A specification is a set of infinite sequences of states that is suffix
closed and fusion closed. Suffix closure of the set means that if a state sequence $\sigma$ is
in that set then so are all the suffixes of $\sigma$. Fusion closure of the set means that if
state sequences $\langle \alpha, x, \gamma \rangle$ and $\langle \beta, x, \delta \rangle$ are in that set then so are the state sequences
$\langle \alpha, x, \delta \rangle$ and $\langle \beta, x, \gamma \rangle$, where $\alpha$ and $\beta$ are finite prefixes of state sequences, $\gamma$ and $\delta$ are
suffixes of state sequences, $x$ is a program state, and $\langle \alpha, x, \gamma \rangle$ denotes a sequence obtained
by concatenating $\alpha$, $x$ and $\gamma$. □

Following Alpern and Schneider [4], it can be shown that any specification is the 
intersection of some “safety” specification that is suffix closed and fusion closed and 
some “liveness” specification. Intuitively, the safety specification identifies a set of bad 
prefixes. A sequence is in the safety specification iff none of its prefixes are identified 
as bad prefixes. Intuitively, a liveness specification requires that any finite sequence be 
extensible in order to satisfy that liveness specification. Formally, 

Definition (Safety). A safety specification is a set of state sequences that meets
the following condition: for each state sequence $\sigma$ not in that set, there exists a prefix
$\alpha$ of $\sigma$, such that for all state sequences $\beta$, $\alpha\beta$ is not in that set. □







Definition (Liveness). A liveness specification is a set of state sequences that meets
the following condition: for each finite state sequence $\alpha$ there exists a state sequence $\beta$
such that $\alpha\beta$ is in that set. □

Notation. Let spec be a specification. We use the term ‘safety of spec’ to mean the 
smallest safety specification that includes spec. 

Note that the synthesis algorithm must be provided with a specification that is 
described in finite space. To simplify further presentation, however, we have defined 
specifications to contain infinite sequences of states. A concise representation of these 
infinite sequences is given in Section 2.6. 

2.3 Program Correctness with respect to a Specification 

Let spec be a specification. 

Definition (Refines). p refines spec from S iff (1) S is closed in p, and (2) every
computation of p that starts in a state where S is true is in spec. □

Definition (Maintains). Let $\alpha$ be a finite sequence of states. The prefix $\alpha$ maintains
spec iff there exists a sequence of states $\beta$ such that $\alpha\beta \in spec$. □

Notation. We say that p maintains spec from S iff S is closed in p and every
computation prefix of p that starts in a state in S maintains spec. We say that p violates
spec from S iff it is not the case that p refines spec from S.

Definition (Invariant). S is an invariant of p for spec iff $S \neq \{\}$ and p refines spec
from S. □

Notation. Henceforth, whenever the specification is clear from the context, we will
omit it; thus, “S is an invariant of p” abbreviates “S is an invariant of p for spec”.

2.4 Faults

The faults that a program is subject to are systematically represented by transitions. 
We emphasize that such representation is possible notwithstanding the type of the 
faults (be they stuck-at, crash, fail-stop, omission, timing, performance, or Byzantine), 
the nature of the faults (be they permanent, transient, or intermittent), or the ability 
of the program to observe the effects of the faults (be they detectable or undetectable). 

Definition (Fault). A fault for p $(= \langle S_p, \delta_p \rangle)$ is a subset of $\{(s_0, s_1) : s_0, s_1 \in S_p\}$. □

For the rest of the section, let spec be a specification, T be a state predicate, S an
invariant of p, and f a fault for p.

Definition (Computation in the presence of faults). A sequence of states,
$\langle s_0, s_1, \ldots \rangle$, is a computation of p $(= \langle S_p, \delta_p \rangle)$ in the presence of f iff the following
three conditions are satisfied:

- $\forall j : j > 0 : (s_{j-1}, s_j) \in (\delta_p \cup f)$,

- if $\langle s_0, s_1, \ldots \rangle$ is finite and terminates in state $s_l$ then there does not exist state s
  such that $(s_l, s) \in \delta_p$, and

- $\exists n : n > 0 : (\forall j : j > n : (s_{j-1}, s_j) \in \delta_p)$. □

Notation. For brevity, we use ‘p[]f’ to mean ‘p in the presence of f’. More specifically,
a sequence is a computation of ‘p[]f’ iff it is a computation of ‘p in the presence of f’.
And, the transitions of p[]f are obtained by taking the union of the transitions of p
and the transitions of f.

Definition (Fault-span). A predicate T is an f-span of p from S iff $S \subseteq T$ and T
is closed in p[]f. □

Thus, at each state where an invariant S of p is true, an f-span T of p from S
is also true. Also, T, like S, is closed in p. Moreover, if any action in f is executed







in a state where T is true, the resulting state is also one where T is true. It follows that
for all computations of p that start at states where S is true, T is a boundary in the
state space of p up to which (but not beyond which) the state of p may be perturbed
by the occurrence of the actions in f.
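The fault-span condition can be checked mechanically once predicates and transitions are finite sets. The sketch below is a hedged illustration, not part of the paper: the function names and the three-state example are assumptions, and the transitions of p[]f are taken to be the union of those of p and f, as in the notation above.

```python
def is_closed(S, transitions):
    """S is closed iff every transition that starts in S also ends in S."""
    return all(b in S for (a, b) in transitions if a in S)


def is_fault_span(T, S, p, f):
    """T is an f-span of p from S iff S is a subset of T and T is closed in p[]f."""
    return S <= T and is_closed(T, p | f)


# Invented example: the fault (0, 1) perturbs the program, which recovers via (1, 0).
p = {(0, 0), (1, 0), (2, 2)}
f = {(0, 1)}
S = {0}
print(is_fault_span({0, 1}, S, p, f))  # True
print(is_fault_span({0}, S, p, f))     # False: the fault (0, 1) leaves {0}
```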

Notation. Henceforth, whenever the program p is clear from the context, we will omit
it; thus, “S is an invariant” abbreviates “S is an invariant of p” and “f is a fault”
abbreviates “f is a fault for p”.

2.5 Fault-Tolerance 

In the absence of faults, a program should refine its specification. In the presence of 
faults, however, it may refine a weaker version of the specification as determined by the 
level of tolerance provided. With this notion, we define three levels of fault-tolerance 
below. 

Definition (failsafe f-tolerant for spec from S). p is failsafe f-tolerant to spec
from S iff (1) p refines spec from S, and (2) there exists T such that T is an f-span of
p from S and p[]f maintains spec from T. □

Definition (nonmasking f-tolerant for spec from S). p is nonmasking f-tolerant
to spec from S iff (1) p refines spec from S, and (2) there exists T such that T is an
f-span of p from S and every computation of p[]f that starts from a state in T has a
state in S. □

Definition (masking f-tolerant for spec from S). p is masking f-tolerant to spec
from S iff (1) p refines spec from S, and (2) there exists T such that T is an f-span of
p from S, p[]f maintains spec from T, and every computation of p[]f that starts from
a state in T has a state in S. □

Notation. In the sequel, whenever the specification spec and the invariant S are clear
from the context, we omit them; thus, “masking f-tolerant” abbreviates “masking
f-tolerant for spec from S”, and so on.

2.6 Observations on Programs and Specifications 

In this section, we summarize observations about our programs and specifications. 
Subsequently, we present the form in which specifications are given to the synthesis 
algorithm. 

Note that a specification, say spec, is a set of infinite sequences of states. If p refines
spec from S then all computations of p that start from a state in S are in spec and,
hence, all computations of p that start from a state in S must be infinite. Using the
same argument, we make the following two observations.

Observation 2.1 If p' is (failsafe, nonmasking or masking) f-tolerant for spec from
S' then all computations of p' that start from a state in S' must be infinite. □

Observation 2.2 If p' is (nonmasking or masking) f-tolerant for spec from S' then
all computations of p'[]f that start from a state in S' must be infinite. □

Observe that we do not disallow fixed-point computations; we simply require that if
$s_0$ is a fixed-point of p then the transition $(s_0, s_0)$ should be included in the transitions
of p.

Concise Representation for Specifications. Recall that a safety specification
identifies a set of bad prefixes that should not occur in program computations. For
fusion closed and suffix closed specifications, we can focus on only prefixes of length 2.
In other words, if we have a prefix $\langle \alpha, s_0 \rangle$ that maintains spec then we can determine
whether an extended prefix $\langle \alpha, s_0, s_1 \rangle$ maintains spec by focusing on the transition
$(s_0, s_1)$, and ignoring $\alpha$. Formally, we state this in Lemma 2.3 as follows (cf. [2] for the
proof):







Lemma 2.3. Let $\alpha$ be a finite sequence of states, and let spec be a specification.

If $\langle \alpha, s_0 \rangle$ maintains spec

Then $\langle \alpha, s_0, s_1 \rangle$ maintains spec iff $\langle s_0, s_1 \rangle$ maintains spec. □

From Lemma 2.3, it follows that the safety specification can be concisely represented
by the set of ‘bad transitions’. For simplicity, we assume that for a given spec and a state
space $S_p$, the set of bad transitions corresponding to the minimal safety specification
that includes spec is given. If this is not the case and spec is given in terms of a
temporal logic formula, the set of bad transitions can be computed in polynomial time
by considering all transitions $(s_0, s_1)$, where $s_0, s_1 \in S_p$.
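To make the role of bad transitions concrete, the following Python sketch checks whether a finite prefix maintains the safety of spec by inspecting consecutive state pairs only, which is what Lemma 2.3 licenses. It is an illustration under stated assumptions: the name `prefix_maintains` is hypothetical, the set `bad` is invented, and every single-state prefix is assumed to maintain spec.

```python
def prefix_maintains(prefix, bad_transitions):
    """A finite prefix maintains the safety of spec iff none of its
    consecutive state pairs is a bad transition."""
    return all((a, b) not in bad_transitions
               for a, b in zip(prefix, prefix[1:]))


bad = {(1, 2)}  # invented set of bad transitions
print(prefix_maintains([0, 1, 0, 1], bad))  # True
print(prefix_maintains([0, 1, 2], bad))     # False: (1, 2) is bad
```

The check is linear in the length of the prefix because the history before each transition never needs to be re-examined.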

Our proof that a fault-tolerant program refines the liveness specification solely 
depends on the fact that the fault-intolerant program refines the liveness specification. 
Therefore, our algorithm can transform a fault-intolerant program into a fault-tolerant 
program even if the liveness specification is unavailable. 

3 Problem Statement 

In this section, we formally specify the problem of deriving a fault-tolerant program
from a fault-intolerant program. We first intuitively characterize what it means for a
fault-tolerant program p' to be derived from a fault-intolerant program p. We use this
characterization to precisely state the transformation problem. Finally, we also discuss
the soundness and completeness issues in the context of the transformation problem.

Now, we consider what it means for a fault-tolerant program p' to be derived from
p. As mentioned in the introduction, our derivation is based on the premise that p' is
obtained by adding fault-tolerance alone to p, i.e., p' does not introduce new ways of
refining spec when no faults have occurred. We precisely state this concept based on
the following two observations: (1) If S' contains states that are not in S then, in the
absence of faults, p' will include computations that start outside S. Since p' refines
spec from S', it would imply that p' is using a new way to refine spec in the absence of
faults (since p refines spec only from S). Therefore, we require that $S' \subseteq S$ (equivalently,
$S' \Rightarrow S$). (2) If $p'|S'$ contains a transition that is not in $p|S'$, p' can use this transition
in order to refine spec in the absence of faults. Since this was not permitted in p, we
require that $p'|S' \subseteq p|S'$. Thus, we define the transformation problem as follows (this
definition will be instantiated for failsafe, nonmasking and masking f-tolerance):

The Transformation Problem
Given p, S, spec and f such that p refines spec from S
Identify p' and S' such that
    $S' \subseteq S$,
    $p'|S' \subseteq p|S'$, and
    p' is f-tolerant to spec from S'.



We also define the corresponding decision problem as follows (this definition will
also be instantiated for failsafe f-tolerance, nonmasking f-tolerance and masking
f-tolerance):

The Decision Problem
Given p, S, spec and f such that p refines spec from S
Does there exist p' and S' such that
    $S' \subseteq S$,
    $p'|S' \subseteq p|S'$, and
    p' is f-tolerant to spec from S'?
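The first two conditions of the problem statement are purely structural and can be tested without reference to spec or f. The Python sketch below illustrates them on sets of transitions; the function names and the example are hypothetical, and the third condition (f-tolerance of p' from S') is deliberately not checked here.

```python
def project(transitions, S):
    """Transitions that start and end in S (the paper's p|S)."""
    return {(a, b) for (a, b) in transitions if a in S and b in S}


def meets_structural_conditions(p, S, p_new, S_new):
    """Check that S' is a subset of S and that p'|S' is a subset of p|S'."""
    return S_new <= S and project(p_new, S_new) <= project(p, S_new)


p = {(0, 1), (1, 0), (2, 0)}
S = {0, 1}
# Adding the recovery transition (2, 1) outside S' is allowed.
print(meets_structural_conditions(p, S, p | {(2, 1)}, {0, 1}))  # True
# Adding (0, 0) inside S' would be a new way to refine spec, so it is rejected.
print(meets_structural_conditions(p, S, p | {(0, 0)}, {0, 1}))  # False
```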









Notations. Given a fault-intolerant program p, specification spec, invariant S and
faults f, we say that program p' and predicate S' solve the transformation problem
for a given input iff p' and S' satisfy the three conditions of the transformation
problem. We say p' (respectively S') solves the transformation problem iff there exists S'
(respectively p') such that p', S' solve the transformation problem.

Soundness and completeness. An algorithm for the transformation problem is 
sound iff for any given input, its output, namely program p' and the state predicate 
S' , solves the transformation problem. An algorithm for the transformation problem 
is complete iff for any given input if the answer to the decision problem is affirmative 
then the algorithm always finds program p' and state predicate S' . 

4 Adding Fault-Tolerance in High Atomicity Model

In this section, we consider the transformation problem for programs in the high atom- 
icity model, where a program transition can read any number of variables as well as 
update any number of variables in one atomic step. In other words, if the enumerated
states of the program are $s_0, s_1, \ldots, s_{max}$ then the program transitions can be any subset
of $\{(s_j, s_k) : 0 \le j, k \le max\}$. We present our algorithms for adding failsafe, nonmasking
and masking fault-tolerance in Sections 4.1, 4.2, and 4.3 respectively.

4.1 Problem of Designing Failsafe Tolerance 

As shown in Section 2, the safety specification identifies a set of bad transitions that
should not occur in program computations. Given a bad transition $(s_0, s_1)$, we consider
two cases: (1) $(s_0, s_1)$ is not a transition of f, (2) $(s_0, s_1)$ is a transition of f.

For case (1), we claim that $(s_0, s_1)$ can be removed while obtaining p'. To see this,
consider two subcases: (a) state $s_0$ is ever reached in the computation of p'[]f, and
(b) state $s_0$ is never reached in the computation of p'[]f. In the former subcase, the
transition $(s_0, s_1)$ must be removed as the safety of spec can be violated if p'[]f ever
reaches state $s_0$ and executes the transition $(s_0, s_1)$. In the latter subcase, the transition
$(s_0, s_1)$ is irrelevant and, hence, can be removed.

For case (2), we cannot remove the transition $(s_0, s_1)$ as it would mean removing
a fault transition. Therefore, we must ensure that p'[]f never reaches the state $s_0$. In
other words, for all states s, the transition $(s, s_0)$ must be removed in obtaining p'.
Also, if any of these removed transitions, say $(s_0', s_0)$, is a fault transition then we must
recursively remove all transitions of the form $(s, s_0')$ for each state s.

Using the above two cases, our algorithm to obtain the failsafe fault-tolerant
program is as follows: it first identifies states, ms, from where execution of one or more
fault transitions violates safety. Then, it removes transitions, mt, of p that reach these
states as well as transitions of p that violate the safety of spec. (The latter part is
included as transitions of p may violate the safety of spec in states outside S.) If there
exist states in the invariant such that execution of one or more fault actions from those
states violates the safety of spec, then we recalculate the invariant by removing those
states. In this recalculation, we ensure that all computations of $p - mt$ within the new
invariant, S', are infinite. In other words, the new invariant is the largest subset of
$S - ms$ such that all computations of $p - mt$ when restricted to that subset are infinite.
Thus, the detailed algorithm, Add_failsafe, is as shown in Figure 1. (As mentioned in
Section 2, we use a program and its transitions interchangeably.)
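The steps described above can be sketched in Python over explicit finite sets. This is a minimal illustration, not the authors' implementation: the snake_case names are hypothetical, the safety specification is assumed to be supplied as a set `bad` of bad transitions (Section 2.6), and the four-state example is invented.

```python
def construct_invariant(S, transitions):
    """Largest subset of S in which every state has a successor inside the subset."""
    S = set(S)
    while True:
        dead = {s for s in S
                if not any(a == s and b in S for (a, b) in transitions)}
        if not dead:
            return S
        S -= dead


def construct_transitions(transitions, S):
    """Remove the transitions that start inside S and end outside it."""
    return {(a, b) for (a, b) in transitions if not (a in S and b not in S)}


def add_failsafe(p, f, S, bad):
    """Return (p', S'), or None when no failsafe f-tolerant program exists."""
    # ms: states from which one or more fault transitions lead to a violation.
    ms = {a for (a, b) in f if (a, b) in bad}
    while True:
        more = {a for (a, b) in f if b in ms} - ms
        if not more:
            break
        ms |= more
    # mt: program transitions that reach ms or are themselves bad.
    mt = {(a, b) for (a, b) in p if b in ms or (a, b) in bad}
    safe_p = set(p) - mt
    S_new = construct_invariant(set(S) - ms, safe_p)
    if not S_new:
        return None
    return construct_transitions(safe_p, S_new), S_new


# Invented example: the fault chain 1 -> 2 -> 3 ends in the bad transition (2, 3).
p = {(0, 0), (0, 1), (1, 0), (2, 0)}
f = {(1, 2), (2, 3)}
bad = {(2, 3)}
p_new, S_new = add_failsafe(p, f, {0, 1}, bad)
print(sorted(p_new), sorted(S_new))  # [(0, 0), (1, 0), (2, 0)] [0]
```

In the example, state 1 is dropped from the invariant because faults alone can drive it to a violation, and the program transition (0, 1) is removed so that state 1 is never entered.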

4.2 Problem of Designing Nonmasking Tolerance 

To design a nonmasking f-tolerant program p', we ensure that from any state p'
eventually recovers to a state in S. Thus, the detailed algorithm, Add_nonmasking, is as
shown in Figure 1. (Note that the function RemoveCycles is defined in such a way that
from each state outside S there is a path that reaches a state in S, and there are no
cycles in states outside S.)







Add_failsafe(p, f : transitions, S : state predicate, spec : specification)
{
    $ms := \{s_0 : \exists s_1, s_2, \ldots, s_n : (\forall j : 0 \le j < n : (s_j, s_{j+1}) \in f) \wedge (s_{n-1}, s_n) \text{ violates } spec\}$;
    $mt := \{(s_0, s_1) : ((s_1 \in ms) \vee (s_0, s_1) \text{ violates } spec)\}$;
    S' := ConstructInvariant($S - ms$, $p - mt$);
    if ($S' = \{\}$) declare no failsafe f-tolerant program p' exists;
    else p' := ConstructTransitions($p - mt$, S')
}

Add_nonmasking(p, f : transitions, S : state predicate, spec : specification)
{
    RemoveCycles(S, true, $(p|S) \cup \{(s_0, s_1) : s_0 \notin S \wedge s_1 \in S\}$)
}

Add_masking(p, f : transitions, S : state predicate, spec : specification)
{
    Define ms and mt as in Add_failsafe.
    $S_1, T_1$ := ConstructInvariant($S - ms$, $p - mt$), true $-$ ms;
    repeat
        $T_2, S_2 := T_1, S_1$;
        $p_1 := p|S_1 \cup \{(s_0, s_1) : s_0 \notin S_1 \wedge s_0 \in T_1 \wedge s_1 \in T_1\} - mt$;
        $T_1$ := ConstructFaultSpan($T_1 - \{s : S_1 \text{ is not reachable from } s \text{ in } p_1\}$, f);
        $S_1$ := ConstructInvariant($S_1 \wedge T_1$, $p_1$);
        if ($S_1 = \{\} \vee T_1 = \{\}$) declare no masking f-tolerant program p' exists;
    until ($T_1 = T_2 \wedge S_1 = S_2$);
    p', S', T' := RemoveCycles($S_1, T_1, p_1$), $S_1$, $T_1$
}

ConstructInvariant(S : state predicate, p : transitions)
// Returns the largest subset of S from where all computations of p are infinite
{ while ($\exists s_0 : s_0 \in S : (\forall s_1 : s_1 \in S : (s_0, s_1) \notin p)$) { $S := S - \{s_0\}$ }; return S }

ConstructTransitions(p : transitions, S : set of states)
{ return $p - \{(s_0, s_1) : s_0 \in S \wedge s_1 \notin S\}$ }

ConstructFaultSpan(T : state predicate, f : transitions)
// Returns the largest subset of T that is closed in f.
{ while ($\exists s_0, s_1 : s_0 \in T \wedge s_1 \notin T \wedge (s_0, s_1) \in f$) { $T := T - \{s_0\}$ }; return T }

RemoveCycles(S, T : state predicates, p : program)
// Requires ($\forall s_0 : s_0 \in T$ : S is reachable from $s_0$ in p)
// Returns $p_1$ such that $p_1 \subseteq p$, $p_1|S = p|S$, $p_1|(T - S)$ is acyclic, and
// ($\forall s_0 : s_0 \in T$ : S is reachable from $s_0$ in $p_1$).

(Since several implementations are possible and any one of them is acceptable,
we let this procedure be non-deterministic in order to let the designer determinize
it to obtain the best efficiency as well as to satisfy other constraints, e.g., further
transformation to add tolerance to new faults.

One possible implementation in polynomial time is where each state is ranked
based upon the shortest path from that state to a state in S, and transitions that
increase the rank are removed.)



Fig. 1. Addition of Fault-Tolerance in High Atomicity 
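The rank-based implementation of RemoveCycles mentioned in Figure 1 can be sketched as follows. This is one possible reading, offered as an assumption rather than the authors' code: ranks come from a backward breadth-first search, and outside S the sketch keeps only transitions that strictly decrease the rank, a slightly stricter rule than removing rank-increasing transitions, chosen so that cycles between states of equal rank also disappear. The name `remove_cycles` and the example are hypothetical.

```python
from collections import deque


def remove_cycles(S, T, p):
    """Return p1, a subset of p that agrees with p inside S, has no cycles
    outside S, and still lets every state of T reach S."""
    # Rank of a state = length of its shortest path to S in p.
    predecessors = {}
    for (a, b) in p:
        predecessors.setdefault(b, set()).add(a)
    rank = {s: 0 for s in S}
    queue = deque(S)
    while queue:
        b = queue.popleft()
        for a in predecessors.get(b, ()):
            if a not in rank:
                rank[a] = rank[b] + 1
                queue.append(a)
    if any(s not in rank for s in T):
        raise ValueError("precondition violated: S must be reachable from T")
    kept = set()
    for (a, b) in p:
        if a in S and b in S:
            kept.add((a, b))  # transitions inside S are untouched
        elif a not in S and a in rank and b in rank and rank[b] < rank[a]:
            kept.add((a, b))  # outside S, keep only rank-decreasing steps
    return kept


# Invented example: the cycle 1 -> 2 -> 1 outside S = {0} is broken.
p = {(0, 0), (1, 0), (1, 2), (2, 1)}
print(sorted(remove_cycles({0}, {0, 1, 2}, p)))  # [(0, 0), (1, 0), (2, 1)]
```

Every state of positive rank keeps at least one transition to a state of lower rank, so reachability of S is preserved while each remaining path outside S strictly descends.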








4.3 Problem of Designing Masking Tolerance 

To design a masking f-tolerant program p', we proceed to identify the weakest invariant
S' (which is stronger than S) and the weakest fault-span T'. To identify the first
estimate for the invariant, S', we proceed as in the case of failsafe fault-tolerance. More
specifically, we first compute states and transitions in S that need to be removed. Then,
we recalculate the invariant to ensure that all computations within S' are infinite. We
estimate T' to be $T_1$ where $T_1$ = true $-$ ms, i.e., $T_1$ includes all states except those in ms.



We continue to strengthen our $S_1$ and $T_1$ while ensuring that if some S' solves the
transformation problem then $S' \subseteq S_1$. We first identify and remove states in $T_1$ from
where it is not possible to reach a state in $S_1$ without violating the safety of spec. We
then find the largest subset of the remaining states that is closed in f. This represents
the new estimate for the fault-span. Since $S_1$ must be a subset of $T_1$, we recalculate $S_1$ to
be the largest subset of $S_1 \wedge T_1$ such that all the computations from that subset are
infinite. We continue this process until we reach a fixpoint. Now, $p_1$ is such that from
every state in $T_1$ there is a path to a state in $S_1$. $p_1$ may, however, contain cycles that
are entirely in $T_1 - S_1$. The function RemoveCycles removes the cycles while maintaining
reachability. Thus, the detailed algorithm, Add_masking, is as shown in Figure 1.
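The fixpoint iteration described above can be sketched in Python on explicit sets. The sketch is hedged in several ways: all names are hypothetical, the safety specification is assumed to be given as a set `bad` of bad transitions, the full state space is passed explicitly as `all_states`, and the final RemoveCycles step is omitted, so the returned transitions may still contain cycles outside the invariant.

```python
def construct_invariant(S, p):
    """Largest subset of S in which every state has a successor inside the subset."""
    S = set(S)
    while True:
        dead = {s for s in S if not any(a == s and b in S for (a, b) in p)}
        if not dead:
            return S
        S -= dead


def construct_fault_span(T, f):
    """Largest subset of T that is closed in f."""
    T = set(T)
    while True:
        leaving = {a for (a, b) in f if a in T and b not in T}
        if not leaving:
            return T
        T -= leaving


def can_reach(S, p):
    """States from which some state of S is reachable using transitions of p."""
    reach = set(S)
    while True:
        more = {a for (a, b) in p if b in reach} - reach
        if not more:
            return reach
        reach |= more


def add_masking_fixpoint(p, f, S, bad, all_states):
    """Return (p1, S1, T1) at the fixpoint, or None if no masking program exists."""
    ms = {a for (a, b) in f if (a, b) in bad}
    while True:
        more = {a for (a, b) in f if b in ms} - ms
        if not more:
            break
        ms |= more

    def in_mt(a, b):
        return b in ms or (a, b) in bad

    S1 = construct_invariant(set(S) - ms,
                             {(a, b) for (a, b) in p if not in_mt(a, b)})
    T1 = set(all_states) - ms
    while True:
        S2, T2 = set(S1), set(T1)
        p1 = {(a, b) for (a, b) in p if a in S1 and b in S1}
        p1 |= {(a, b) for a in T1 - S1 for b in T1}  # candidate recovery steps
        p1 = {(a, b) for (a, b) in p1 if not in_mt(a, b)}
        T1 = construct_fault_span(T1 & can_reach(S1, p1), f)
        S1 = construct_invariant(S1 & T1, p1)
        if not S1 or not T1:
            return None
        if S1 == S2 and T1 == T2:
            return p1, S1, T1


# Invented example: fault (0, 1) is recoverable, fault (2, 3) is a bad transition.
result = add_masking_fixpoint(p={(0, 0)}, f={(0, 1), (2, 3)}, S={0},
                              bad={(2, 3)}, all_states={0, 1, 2, 3})
p1, S1, T1 = result
print(sorted(S1), sorted(T1))  # [0] [0, 1, 3]
```

In the example, state 2 is excluded from the fault-span because a fault from it violates safety, while states 1 and 3 stay in the fault-span and receive recovery transitions.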

While we leave the proof of soundness and completeness of algorithms Add_failsafe,
Add_nonmasking and Add_masking to [3], we note that

Theorem 4.1 The algorithms Add_failsafe, Add_nonmasking and Add_masking
are sound, complete, and in P. □



5 Adding Fault-Tolerance in Low Atomicity Model 

The synthesis algorithm in Section 4 assumes that the fault-tolerant program can
contain a transition $(s_0, s_1)$ for any two states $s_0$, $s_1$. If we think of the program state
as consisting of variables and their corresponding values, the synthesis algorithm assumes
that the program can read the values of all variables and write the values of all variables
in an atomic step. In this section, we first describe a low atomicity model that
imposes restrictions on how processes can read and write variables. Then, we
outline our algorithm in Section 5.1.

We assume that the program consists of processes; each process can atomically 
read a subset of the program variables and write (a possibly different) set of variables. 
To systematically use these restrictions imposed by the model, we now define what it 
means for a process to read and write a variable. First, we define the following two 
notations. 

Notation. Let x be a variable. $x(s_0)$ denotes the value of variable x in state $s_0$.

Notation. Let $r_j$ denote the set of variables that process j is allowed to read and $w_j$ denote
the set of variables that j is allowed to write.

For simplicity, we assume that j can atomically read all variables in $r_j$ and write
all variables in $w_j$. If this is not the case, we split process j into multiple processes
that satisfy this assumption. We leave it to the reader to verify that this can always
be done.

Remark. Note that the above restrictions are for the program actions only. Faults are 
not restricted in any way, i.e., a fault transition could read and write all the variables 
in one atomic step. 

Write-restrictions. If j can only write the subset of variables $w_j$ and the value of a
variable other than those in $w_j$ is changed in the transition $(s_0, s_1)$ then that transition
cannot be used in synthesizing the transitions of j. In other words, being able to write



Large Automating the Addition of Fault-Tolerance 9 1 



the subset Wj is equivalent to providing a set of transitions write{j,Wj) that j cannot 
use synthesis algorithm, where 

write{j,Wj) = {(so,si) : (3a; : x^Wj : a;(so) y^a;(si))} 

Read-restrictions. Initially, we consider the case where $w_j \subseteq r_j$, i.e., $j$ can write 
a variable only if it can read it. Let $(s_0, s_1)$ be some transition of process $j$ such that 
$s_0 \neq s_1$. Now, consider a state $s'_0$ such that the values of all variables in $r_j$ are identical 
to those in $s_0$. Since $j$ can only read variables in $r_j$, $j$ must have a transition of the 
form $(s'_0, s'_1)$. Moreover, the values of variables in $r_j$ in $s'_1$ must be the same as those in 
$s_1$. And, since $w_j \subseteq r_j$, the values in $s'_1$ of variables that are not in $r_j$ must be the same as 
those in $s'_0$. Considering all states where the values of $r_j$ are the same, we get a group of 
transitions; if $(s_0, s_1)$ is a transition of $j$ then all transitions in that group must also be 
transitions of $j$. We define these transitions as $group(j, r_j)(s_0, s_1)$, for the case where 
$w_j \subseteq r_j$, where 

$$group(j, r_j)(s_0, s_1) = \{ (s'_0, s'_1) : (\forall x : x \in r_j : x(s_0) = x(s'_0) \wedge x(s_1) = x(s'_1)) \wedge (\forall x : x \notin r_j : x(s'_0) = x(s'_1) \wedge x(s_0) = x(s_1)) \}$$ 
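
As an illustration of the grouping induced by read-restrictions, the following Python sketch enumerates group(j, r_j)(s_0, s_1) over a small explicit state space for the case where w_j is a subset of r_j. The state encoding (a dict from variable name to value), the helper names `all_states` and `group`, and the two Boolean variables are assumptions made for this sketch only; they are not taken from the paper.

```python
from itertools import product


def all_states(domains):
    """Enumerate every state as a dict {variable: value}."""
    names = sorted(domains)
    return [dict(zip(names, values))
            for values in product(*(domains[n] for n in names))]


def group(readable, s0, s1, domains):
    """Transitions (t0, t1) that process j cannot distinguish from (s0, s1).

    Assumes w_j is a subset of r_j, so unreadable variables never change.
    """
    unreadable = [x for x in domains if x not in readable]
    # The definition requires x(s0) == x(s1) for every unreadable x.
    if any(s0[x] != s1[x] for x in unreadable):
        return []
    result = []
    for t0 in all_states(domains):
        if any(t0[x] != s0[x] for x in readable):
            continue  # readable variables must match s0 in the source state
        t1 = dict(t0)  # unreadable variables keep their values from t0
        for x in readable:
            t1[x] = s1[x]  # readable variables end with the values in s1
        result.append((t0, t1))
    return result


# Hypothetical program with two Boolean variables; process j reads only x.
domains = {"x": [0, 1], "y": [0, 1]}
g = group({"x"}, {"x": 0, "y": 0}, {"x": 1, "y": 0}, domains)
```

Here the group contains one transition per value of the unreadable variable y, which is why choosing one transition of the group forces the others into the program.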

Now, we consider the case where $w_j \not\subseteq r_j$, i.e., $j$ writes variables without reading 
them. To motivate such cases, consider the following scenario: Let $chan_j$ denote the 
sequence of messages on channel $chan$, which is an outgoing channel from process $j$. 
When $j$ sends a message, it writes $chan_j$. However, $j$ cannot read what messages are 
still pending on channel $chan$, i.e., $j$ cannot read $chan_j$. When $j$ updates $chan_j$, the 
new value of $chan_j$ depends upon the initial state of the program (including the initial 
value of $chan_j$). In other words, there exists a function $f_{chan_j}$ such that when $j$ executes 
in state $s_0$, $j$ assigns the value $f_{chan_j}(s_0)$ to $chan_j$. 

More generally, if $j$ can write multiple variables, say $x_1, x_2, \ldots$, without being able to 
read any of them, the model provides a function $f$ (or a polynomial number of different 
functions) such that when $j$ executes in state $s_0$, $j$ assigns the value $x_i(f(s_0))$ to variable 
$x_i$. Using $f$ (or for each possible function $f$), we now define a group of transitions, 
$group(j, f, r_j)(s_0, s_1)$, where 

$$group(j, f, r_j)(s_0, s_1) = \{ (s'_0, s'_1) : (\forall x : x \in r_j : x(s_0) = x(s'_0) \wedge x(s_1) = x(s'_1)) \wedge (\forall x : x \notin r_j : x(s'_1) = x(f(s'_0)) \wedge x(s_1) = x(f(s_0))) \}$$ 

Remark. The above grouping is done for the case where the transition is not a self-loop. 
Regarding self-loops, there are no restrictions. We model this by introducing a 
group $(s_0, s_0)$ for each state $s_0$. Note, however, that given a program $p$ with invariant $S$, the 
masking (respectively, nonmasking) fault-tolerant program $p'$ can contain a self-loop 
only if it is in $p|S$. 

Combining read-restrictions and write-restrictions. The inability of a process 
to read is characterized in terms of grouping of transitions. Thus, if a transition in 
some group violates the restrictions imposed by the inability to write, then that entire 
group must be excluded in the design of the fault-tolerant program. It follows that after 
combining the read-restrictions and the write-restrictions, we get another grouping of 
transitions; we need to choose zero or more such groups to obtain the transitions of 
that process. Moreover, the time to compute these groups is polynomial in the size of 
the input. Thus, we have 

Observation 5.1 The groups of transitions corresponding to the given fault-intolerant 
program and the low atomicity model describing the processes (with the restriction on 
their ability to read and write) can be computed in polynomial time. □ 

5.1 Algorithm Sketch 

Now, we sketch our algorithm for adding fault-tolerance in the low atomicity model. 
Our algorithm is in NP and, hence, the complexity of the corresponding (brute-force) 
deterministic algorithm is at most exponential. Since the algorithm is nondeterministic, we simply guess the 
solution, namely, the invariant $S'$, the fault-span $T'$, and the groups of transitions that 
are included in the fault-tolerant program $p'$. Subsequently, we verify that the 
three conditions of the transformation problem are satisfied. In this verification, the 
first two conditions, closure of $S'$ in $p'$ and closure of $T'$ in $p' [] f$, can be verified easily 
in polynomial time. The third condition, $f$-tolerance, is verified by using $T'$ as 
the fault-span. For failsafe and masking transformation, safety is verified by ensuring 
that $p'|T'$ does not contain transitions in $mt$ (as defined in Add-failsafe in Figure 1). 
For nonmasking and masking transformation, convergence to $S'$ is verified by checking that 
(1) there is an outgoing edge from each state in $T'$ and (2) $p'|(T' - S')$ is acyclic. (For 
reasons of space, we relegate the detailed algorithm to [3].) 
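
The convergence check in the verification step can be sketched as follows. The explicit-state representation (states as hashable values, the program as a set of state pairs) and the function name `converges` are assumptions of this sketch; the detailed algorithm is the one in [3], which may differ.

```python
def converges(transitions, fault_span, invariant):
    """Check the two convergence conditions on an explicit-state program.

    (1) every state in the fault-span has an outgoing transition, and
    (2) the transitions restricted to fault_span - invariant are acyclic.
    """
    outside = fault_span - invariant
    succ = {s: set() for s in fault_span}
    for (a, b) in transitions:
        if a in succ:
            succ[a].add(b)
    if any(not succ[s] for s in fault_span):
        return False  # condition (1) fails: some state has no outgoing edge

    # Condition (2): iterative depth-first search for a cycle in `outside`.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {s: WHITE for s in outside}
    for root in outside:
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(succ[root] & outside))]
        while stack:
            node, successors = stack[-1]
            nxt = next(successors, None)
            if nxt is None:
                colour[node] = BLACK
                stack.pop()
            elif colour[nxt] == GREY:
                return False  # back edge: a cycle outside the invariant
            elif colour[nxt] == WHITE:
                colour[nxt] = GREY
                stack.append((nxt, iter(succ[nxt] & outside)))
    return True
```

Both checks visit each state and transition a bounded number of times, which is consistent with the claim that a guessed solution can be verified in polynomial time.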

5.2 NP-completeness of Adding Masking Fault-Tolerance 

To show that the problem of adding masking fault-tolerance is NP-complete, we reduce 
the problem of 3-SAT to that of adding masking fault-tolerance. Given a 3-SAT problem 
consisting of literals $a_1, \ldots, a_n$ (and respective complements $a'_1, \ldots, a'_n$), we construct a 
graph where there are three vertices, $a_i$, $b_i$, $s_i$, for each $a_i$ and one vertex for each 
clause $c_j$. (The vertices in this graph denote the program states and edges denote 
the program transitions.) We define faults in such a way that each of these vertices is 
reachable in the presence of faults. We then select processes and variables such that the 
edges $(b_i, a_i)$, $(a_i, b'_i)$ and $(b'_i, s'_i)$ are grouped. Also, the graph contains edges from each 
clause $c_j$ to each literal in that clause. The invariant of the fault-intolerant program 
consists of the $s_i$ (and $s'_i$) states. From the possible permitted transitions, the program 
must first reach the vertex corresponding to some $a_i$ (or $a'_i$), then reach $b'_i$ and then $s'_i$. 
Due to grouping constraints, the edge $(b_i, a_i)$ must also be included in the program. 
Observe that the truth value assigned to $a_i$ determines whether the masking fault-tolerant 
program converges via $a_i$ or $a'_i$. (Note that the program cannot converge via 
both $a_i$ and $a'_i$ as it would imply that there would be a cycle outside the invariant.) 
Thus, we construct an instance of the problem of adding masking fault-tolerance that 
has a solution iff the 3-SAT formula is satisfiable. For reasons of space, the detailed 
proof is in [3]. 



6 Conclusion and Future Work 

In this paper, we focused on the problem of adding fault-tolerance to a fault-intolerant 
program for three levels of fault-tolerance, namely failsafe, nonmasking and masking. 
We showed that these transformations are feasible and that their complexity depends upon 
the underlying system model. More precisely, the complexity was polynomial in a model 
where a process could read and write all variables, and it was exponential for the case 
where restrictions were imposed on the ability of processes to read and write. We also 
argued that there are system models for which the complexity of adding masking fault-tolerance 
will be exponential unless P = NP. 

In more detail, we showed that in the high atomicity model, where the program can read and write 
all the variables in one atomic step, all these transformations can be performed in 
polynomial time in the size of the fault-intolerant program. We also showed that in the 
low atomicity model, where the program consists of processes each of which can only 
read and write a limited set of variables, all these transformations can be performed 
in exponential time in the size of the fault-intolerant program. For reasons of space, 
the discussion of example programs that can be designed using these algorithms, 
namely triple modular redundancy, Byzantine agreement and token ring circulation, 
and the proof that the problem of adding masking fault-tolerance to a given 
fault-intolerant program is NP-hard are relegated to [3]. 




The main difference between our work and the previous work on program synthesis 
[5-10] is that we begin with a fault-intolerant program and transform it to obtain fault- 
tolerance. By way of contrast, algorithms in [5-10] deal with synthesizing a program 
from its specification (typically in a temporal logic). For this reason, we believe that 
our approach will be especially useful if a fault-intolerant program is already known or 
if other constraints (such as unavailability of a complete specification of the given fault- 
intolerant program) require that we reuse the fault-intolerant program. Also, due to 
the same reason, our algorithms only needed the safety specification that the program 
is supposed to satisfy in the presence of faults; the algorithms did not need the liveness 
specification. 

Another difference between our work and previous work on synthesizing fault- 
tolerant programs [7-10] is the generality of our fault-model and that of the low atom- 
icity model. Specifically, our low atomicity model is more general than the Read/Write 
model considered elsewhere [9, 11]. For example, our low atomicity model includes 
common shared memory models where a process can atomically read its neighbors' state 
and write its own state. This ability to design programs of atomicity higher than 
the Read/Write atomicity will be especially useful when adding fault-tolerance in 
Read/Write atomicity is impossible and adding it in higher atomicity is possible. 

Our work on transformation raises the following open questions: Do there exist 
system models which are stronger than the high atomicity model but weaker than the 
(general) low atomicity model for which polynomial transformations are possible? Do 
there exist specific fault-models for which polynomial transformation is possible? We 
will address these questions in future work. 

References 

1. A. Arora and S. S. Kulkarni. Detectors and correctors: A theory of fault-tolerance 
components. International Conference on Distributed Computing Systems, pages 
436-443, May 1998. 

2. S. S. Kulkarni. Component-based design of fault-tolerance. PhD thesis, Ohio State 
University, 1999. 

3. Sandeep S. Kulkarni and Anish Arora. Automating the addition of fault-tolerance. 
Technical Report MSU-CSE-00-13, Computer Science and Engineering, Michigan 
State University, East Lansing, Michigan, June 2000. 

4. B. Alpern and F. B. Schneider. Defining liveness. Information Processing Letters, 
21:181-185, 1985. 

5. E. A. Emerson and E. M. Clarke. Using branching time temporal logic to syn- 
chronize synchronization skeletons. Science of Computer Programming, 2:241-266, 
1982. 

6. Z. Manna and P. Wolper. Synthesis of communicating processes from temporal 
logic specifications. ACM Transactions on Programming Languages and Systems, 
6:68-93, 1984. 

7. A. Pnueli and R. Rosner. On the synthesis of a reactive module. ACM Symposium 
on Principles of Programming Languages, pages 179-190, 1989. 

8. A. Anuchitanukul and Z. Manna. Reliability and synthesis of reactive modules. 
International Conference on Computer-Aided Verification, pages 156-169, 1994. 

9. A. Arora, P. C. Attie, and E. A. Emerson. Synthesis of fault-tolerant concurrent 
programs. Proceedings of the 17th ACM Symposium on Principles of Distributed 
Computing (PODC), 1998. 

10. O. Kupferman and M. Vardi. Synthesis with incomplete information. ICTL, 1997. 

11. D. Dill and H. Wong-Toi. Synthesizing processes and schedulers from temporal 
specifications. International Conference on Computer-Aided Verification, 1990. 




Reliability Modelling of Time-Critical 
Distributed Systems 



Hans Hansson, Christer Norström, and Sasikumar Punnekkat 



Mälardalen Real-Time Research Centre, 
Department of Computer Engineering, 
Mälardalen University, Västerås, SWEDEN, 
han@idt.mdh.se, cen@mdh.se, spt@idt.mdh.se, 
WWW home page: http://www.mrtc.mdh.se 



Abstract. In cost-conscious industries, such as automotive, it is imperative 
for designers to adhere to policies that reduce system resources 
to the extent feasible, even for safety-critical sub-systems. However, the 
overall reliability requirement, typically in the order of $10^{-9}$ faults/hour, 
must be both analysable and met. Faults can be hardware, software or 
timing faults, the latter being handled by hard real-time schedulability 
analysis, which is used to prove that no timing violations will occur. However, 
from a reliability and cost perspective there is a tradeoff between 
timing guarantees, the level of hardware and software faults, and the 
per-unit cost for meeting the overall reliability requirement. 

This paper outlines a reliability analysis method that considers the ef- 
fect of faults on schedulability analysis and its impact on the reliability 
estimation of the system. The ideas have general applicability, but the 
method has been developed with modeling of external interferences of 
automotive CAN buses in mind. We illustrate the method using the ex- 
ample of a distributed braking system. 



1 Introduction 

The fault tolerance and real-time realms of research have evolved in parallel and, 
though each has been greatly successful independently, they still fail to bring the necessary 
synergy between the two fields, both of which are of extreme importance in the design 
of safety-critical systems. Their mutual dependencies and interactions need to be 
analysed carefully for achieving predictable performance. The major stumbling 
block in having an integrated approach is the orthogonal nature of two factors, 
viz., the stochastic nature of faults and the deterministic requirements on 
schedulability analysis. This calls for the development of more realistic fault models 
that capture the nuances of the environment, methods for easy integration 
of such models into the timing analysis and, finally, a unified and 'formal' 
approach in using them to obtain refined estimates for the system reliability. 

During the last decade, schedulability analysis of real-time systems has developed 
into a mature discipline for determining whether a set of tasks executing 
on a single CPU or in a distributed system will meet their deadlines or not 
[2][1][6][10]. The essence of the analysis is to investigate whether the deadlines are 
met in a worst case scenario. Whether this worst case actually will occur during 
execution, or if it is likely to occur, is not normally considered (an exception 
being [3]). Reliability modeling, on the other hand, involves the study of fault models, 
the characterization of distribution functions of faults, and the development of methods 
and tools for composing these distributions and models in estimating an overall 
reliability figure for the system. 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 94-105, 2000. 

© Springer-Verlag Berlin Heidelberg 2000 

We have recently [5] developed a model for calculating worst-case latencies 
of messages under error assumptions, for the Controller Area Network (CAN). 
This analysis might infer that a given message set is not feasible under worst 
case fault interferences. Such a result, though correct, is of only limited help to 
system designers, except to prompt them to overdesign the system and waste 
resources to tackle a situation that might never happen during the lifetime 
of the system. 

When performing schedulability analysis it is important to keep in mind that 
the analysis is only valid under some specific model assumptions. Behaviours 
outside these assumptions are typically catered for in the reliability analysis. This 
separation of deterministic (0/1) schedulability analysis and stochastic reliability 
analysis is a natural simplification of the total analysis, which might be 
pessimistic. Consider, for instance, occasional external interference on a communication 
link. The effect will be increased message latencies, which may lead to 
missed deadlines, especially if the interference coincides with the worst case message 
transmission scenario considered when performing schedulability analysis. 
In other scenarios, the interference might not increase the worst case message 
latency, as illustrated in Figure 1. The figure shows a system with 3 periodic 
messages $M_1$, $M_2$ and $M_3$ with descending priorities and with periods (equal to 
deadlines) of 5, 10 and 20 and worst-case transmission times of 2, 1 and 1 respectively. 
Assuming an overhead, $O = 1$, for error signaling and recovery (but not 
including retransmission of the corrupted message), we have shown the effects 
of 3 different scenarios, corresponding to an external interference hitting the system 
at different points in time. In the first case, both $M_2$ and $M_3$ miss their 
deadlines. In the second case, though a re-transmission is necessitated, the 
message set still meets its deadlines, whereas in the third scenario, the error has no 
effect at all since it falls in a period of inactivity of the bus. Hence, schedulability 
analysis which only considers the worst-case phasing introduces pessimism by 
assuming that any interference will lead to a missed deadline. 

The basic argument of our work is that a system can only be guaranteed 
up to some level, after which we must resort to reliability analysis, and that the 
reliability analysis can be made more accurate if it considers schedulability. In 
this paper we present: 

- An approach for integrating schedulability analysis and reliability models 

- A systematic procedure for obtaining more accurate reliability estimates 

- Modified response time modelling for CAN messages under our fault model 

- An illustrative example presenting the usefulness of this method 




[Figure 1: bus timelines over 20 time units for messages M1, M2 and M3 under three interference scenarios. Surviving panel labels: "Scenario-2: Schedulable" and "Scenario-3: No effect".] 

Fig. 1. Dependency of Effects of Faults on Phasings 



The outline of the paper is as follows. Section 2 presents general reliability 
modelling for distributed real-time systems and introduces our approach. Sec- 
tion 3 specifically discusses the scheduling of message sets in Controller Area 
Networks under a general fault model and subsequently extends it to analyse 
arbitrary samples of phasings and interferences. Section 4 presents a case study 
of messages in a distributed computer network used in passenger cars. The concluding 
Section 5 discusses some possible extensions. 

2 Reliability Modelling 

Reliability is defined as the probability that a system can perform its intended 
function, under given conditions, for a given time interval. In the context of an 
Antilock Braking System (ABS) for automobiles, this boils down to performing 
the tasks (mainly input_sensors, compute_control, output_actuators, etc.) 
as per the specifications. Since ABS is part of a real-time system, its specifications 
imply that the results must be both functionally correct and 
within timing specifications. A major issue here is how to compose hardware 
reliability, software reliability, environment model, and timing correctness to 
arrive at reasonable estimates of overall system reliability. Let us define 

$$P_{HF}(t) = \text{Probability}(\text{Hardware failure at } t) \qquad (1)$$ 

$$P_{SF}(t) = \text{Probability}(\text{Software failure at } t) \qquad (2)$$ 

$$P_{CF}(t) = \text{Probability}(\text{Communication failure at } t) \qquad (3)$$ 

The reliability of the system, $R(t)$, is the probability that the system performs 
all its intended functions correctly for a period $t$. This is given by the product 
of cumulative probabilities that there are no failures in hardware, software and 
communication subsystem during the period $(0, t)$. That is, 

$$R(t) = \left(1 - \int_0^t P_{HF}(\tau)\,d\tau\right) \left(1 - \int_0^t P_{SF}(\tau)\,d\tau\right) \left(1 - \int_0^t P_{CF}(\tau)\,d\tau\right) \qquad (4)$$ 
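
To make the product form of the system reliability concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each failure density is constant over the period, so that its integral over (0, t) is simply rate times t; the function name and the numeric rates are hypothetical and are not taken from the paper.

```python
def reliability(t, rate_hf, rate_sf, rate_cf):
    """R(t) as the product of three 'no failure during (0, t)' probabilities.

    Each failure density is assumed constant (an assumption of this sketch),
    so its integral over (0, t) is rate * t.
    """
    no_hardware_failure = 1.0 - rate_hf * t
    no_software_failure = 1.0 - rate_sf * t
    no_communication_failure = 1.0 - rate_cf * t
    return no_hardware_failure * no_software_failure * no_communication_failure


# Hypothetical rates per hour and a 1000-hour period.
r = reliability(t=1000.0, rate_hf=1e-6, rate_sf=2e-6, rate_cf=3e-6)
```

The communication factor is the only one refined in the remainder of the paper; the other two factors enter the product unchanged.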

In this paper, we concentrate only on the final term in Equation 4, i.e., the 
probability that no errors occur in the communication subsystem. Please note 
that, when we talk about the communication subsystem, we are not merely concentrating 
on the faults in the hardware or software of such a system. Instead 
we consider the probability of correct and timely delivery of message sets. Since 
the main cause for an incorrect (corrupted, missing or delayed) message delivery 
is environmental interference, an appropriate modelling of such factors is 
essential. 

The basis for our modelling and analysis will be an appropriate environment 
model and a method to perform response time analysis of CAN messages under 
normal and error conditions [5], together with subsystem reliability requirements 
and timing specifications for the set of tasks and messages implementing the 
subsystem. These subsystem specifications and requirements are derived from 
overall system reliability and timing requirements. 

The problem analysed is, given the above information, how to find a suitable 
way of predicting the reliability of the communication subsystem. The simple 
approach would be to give a 0/1 weight to the schedulability aspect in evaluating 
the system 'correctness' and calculate the reliability. However, the environment 
model provides worst case scenarios, which may not occur in practice, and their 
impact may depend on the actual phasing of messages and the way in which 
they interact with the environment/fault model. So, we are faced with additional 
questions, such as: 

- Can we partition the schedulability analysis under faults by considering a set 
of scenarios corresponding to message and fault model phasings and isolate 
the worst case as only one of several scenarios? 

- Is it possible to get a more accurate reliability estimate by such an analysis? 

2.1 Reliability Estimation 

By definition, reliability is specified for a mission time. Normally we can assume 
a repetitive pattern of messages (over the least common multiple (LCM) of the 
message periods). Since such an LCM is typically a very small fraction of the 
mission time, it is sufficient to find the impact of environmental interferences 
over some constant $k$ cycles of LCM, where $1 \le k \le \text{mission time} / \text{LCM}$. We can 
then extrapolate that data to get the projection over the entire mission time. 
A suitable value for $k$ is one for which $k \times \text{LCM}$ is large enough to contain 
any of the interference patterns, so that they do not spill over LCM boundaries. 

Here we concentrate on $P_{CF}(t)$ and outline a methodology for estimating it. 
Let $t$ represent an arbitrary time point in $(0, k \times \text{LCM})$ that marks the time 
instant when the external interference hits the bus and causes an error. If we 
can assume zero error latency and instantaneous error detection, then $t$ becomes 
the time point of detection of an error on the bus. We define 

$$P_I(t) = \text{Probability}(\text{Interference at } t) \qquad (5)$$ 

$$P_{Ts}(t) = \text{Probability}(\text{Deadline miss} \mid \text{Interference at } t) \qquad (6)$$ 

By relying on the extensive error detection and handling features available in 
CAN, we can safely assume that an error caused by message corruption is either detected 
and corrected by re-transmission or will ultimately result in a timing error. So, 
the probability of communication failure due to interference starting at $t$ is: 

$$P_{CF}(t) = P_{Ts}(t) \times P_I(t) \qquad (7)$$ 

In our environment model [5], we have assumed the possibility of an interference 
$I_1$ from source 1, having a certain pattern, hitting the message transmission. 
Let $P_I^1(t)$ be the probability of such an event occurring at time $t$. We also assume 
that another interference $I_2$ from source 2, having a different pattern, can hit 
the system at time $t$ with a probability, say, $P_I^2(t)$. In [5], we assumed both these 
interferences hitting the message transmission resulting in the worst case impact 
on schedulability. In this paper, we will increase the realism in this modeling by 
relaxing the requirement on the phasing between schedule and interference. Note 
that there is an implicit assumption that these interferences are independent. 

2.2 Approach to Analysis 

To calculate the subsystem reliability, we first need to calculate the failure 
probability, i.e. the probability of at least one failure (defined as a missed deadline) 
during the mission time. We use the following method: 

1. We assume that the interference free system is schedulable, i.e. it meets all 
deadlines with probability 1. 

2. For each interference source, we calculate the probability of interference in 
an arbitrary LCM. This could be given by: 

$$\frac{\text{sum of interference periods during mission time}}{\text{mission time}}$$ 

Here we will assume that each interference source can be characterized 
by periods of relatively frequent interferences (assumed to occur at least once 
every LCM) and interference-free periods. The above "sum" denotes the 
total length of the former periods. 

Alternatively, the above probability for interference may, based on some 
other calculations or estimations, be provided by the designer. 

3. Calculate probabilities for all combinations of interference sources. This is 
the product of individual probabilities, since we assume independent sources. 

4. For each combination of interference sources, calculate the probability of an 
error in an LCM caused by the considered combination of interference. This 
is performed by the following procedure 

a) Do until the required confidence level is reached (see footnote 1): 

i. Make a random selection of phasings by, for each considered source, 
making a random selection from the set of discrete time points (up 
to the granularity of the schedule) $(0, t_f]$, where $t_f$ is the periodicity 
of the interference (as defined in Section 3.2). Each picked sample 
indicates the position within $t_f$ of the corresponding source at time 0 in 
the LCM. 

ii. Perform schedulability analysis (as detailed in subsequent sections). 

iii. If not schedulable, then increase deadline_miss_count. 

iv. The failure probability is deadline_miss_count / number_of_samples. 

5. Total subsystem reliability with required confidence is 1 minus the weighted 
mean of calculated failure probabilities. This weighted mean is the sum of 
the probabilities for interference sources (as calculated in 3 above) multiplied 
with the corresponding failure probability (as defined in 4(a)iv above). 

Footnote 1. Since we are using random sampling, rather than complete analysis, keeping track 
of the confidence level of the analysis results is essential. However, to simplify this 
first presentation of our approach we will not here consider this further. 
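
Steps 3 to 5 above can be sketched in Python as follows. This is only an illustration under stated assumptions: `is_schedulable` is a hypothetical callback standing in for the schedulability analysis of Section 3, a fixed number of samples replaces the confidence-level stopping rule, and the probability of a combination is taken to be the probability that exactly those sources are active (one possible reading of "product of individual probabilities").

```python
import random
from itertools import combinations


def failure_probability(periods, is_schedulable, samples, rng):
    """Step 4: fraction of random phasings that lead to a missed deadline."""
    misses = 0
    for _ in range(samples):
        # One offset per active source, drawn from the points 1..period.
        offsets = {name: rng.randint(1, period)
                   for name, period in periods.items()}
        if not is_schedulable(offsets):
            misses += 1
    return misses / samples


def subsystem_reliability(sources, is_schedulable, samples=1000, seed=0):
    """Steps 3-5 for one LCM; `sources` maps name -> (probability, period)."""
    rng = random.Random(seed)
    names = sorted(sources)
    weighted_failure = 0.0
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # Assumption: exactly the sources in `combo` are active.
            p_combo = 1.0
            for name in names:
                p = sources[name][0]
                p_combo *= p if name in combo else (1.0 - p)
            periods = {name: sources[name][1] for name in combo}
            weighted_failure += p_combo * failure_probability(
                periods, is_schedulable, samples, rng)
    return 1.0 - weighted_failure


# Hypothetical sources: (probability of interference in an LCM, period).
sources = {"a": (0.1, 5), "b": (0.2, 10)}
always_ok = subsystem_reliability(sources, lambda offsets: True)
never_ok = subsystem_reliability(sources, lambda offsets: False)
```

The two degenerate callbacks bracket the result: if every phasing is schedulable the reliability is 1, and if none is, the reliability reduces to the probability that no source interferes at all.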

3 Schedulability Analysis of CAN Messages 

The Controller Area Network (CAN) is a broadcast bus designed to operate at 
speeds of up to 1 Mbps. Each CAN message can contain 0-8 bytes of data. A 
unique 11 bit identifier is associated with each message, which assigns a priority 
to the message. CAN uses deterministic collision resolution to control access to 
the bus. During arbitration, competing stations are simultaneously putting their 
identifiers on the bus. The station with highest priority identifier will win the 
arbitration, and start transmitting the body of the message. 

The CAN message format contains 47 bits of protocol control information. 
The data transmission protocol inserts a stuff bit after five consecutive bits of 
the same value. The frame format is specified such that only 34 of the 47 control 
bits are subject to bit stuffing. Hence, the maximum number of stuff bits in a 
message $m_i$ with $n$ bytes of data is $\lfloor (34 + 8n - 1)/4 \rfloor$ (since the worst case bit pattern 
is '1111100001111...'). This means that a message is transmitted with between 0 
and 24 stuff bits. Hence, the size of a transmitted CAN message is in the range 
47..135 bits. The worst case transmission time, denoted $C_i$, of message $m_i$ is 
given by the number of bits to be transmitted for the message multiplied by 
the time required to transmit one bit, denoted $\tau_{bit}$. Hence, for a message with $n$ 
bytes of data, 

$$C_i = \left( 8n + 47 + \left\lfloor \frac{34 + 8n - 1}{4} \right\rfloor \right) \tau_{bit}$$ 
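
The frame-size arithmetic can be checked with a few lines of Python. The stuff-bit expression used here is the one consistent with the bounds quoted above (at most 24 stuff bits and 135 bits for 8 data bytes); the function names are hypothetical.

```python
def max_stuff_bits(n_bytes):
    """Worst-case number of stuff bits for a frame with n_bytes of data."""
    return (34 + 8 * n_bytes - 1) // 4


def worst_case_frame_bits(n_bytes):
    """Data bits plus 47 control bits plus worst-case stuff bits."""
    if not 0 <= n_bytes <= 8:
        raise ValueError("a CAN message carries 0-8 bytes of data")
    return 8 * n_bytes + 47 + max_stuff_bits(n_bytes)


def transmission_time(n_bytes, tau_bit):
    """Worst-case transmission time C_i for a bit time of tau_bit."""
    return worst_case_frame_bits(n_bytes) * tau_bit
```

For example, at 1 Mbps the bit time is one microsecond, so an 8-byte message occupies the bus for at most 135 microseconds.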



3.1 Classical CAN Bus Analysis 

Tindell et al. [7] [8] [9] present analysis to calculate the worst-case latencies of 
CAN messages. This analysis is based on the standard fixed priority response 
time analysis for CPU scheduling [1]. 

Calculating the response times requires a bounded worst case queuing pattern 
of messages. The standard way of expressing this is to assume a set of traffic 
streams, each generating messages with a fixed priority. The worst case behaviour 
of each stream is to periodically queue messages. In analogy with CPU 
scheduling, we obtain a model with a set $S$ of streams. Each $S_i \in S$ is a triple 
$\langle P_i, T_i, C_i \rangle$, where $P_i$ is the priority (defined by the message identifier), $T_i$ is 
the period and $C_i$ the worst case transmission time of messages sent on stream 
$S_i$. The worst-case latency $R_i$ of a CAN message stream $S_i$ is defined by: 

$$R_i = J_i + q_i + C_i \qquad (8)$$ 

where $J_i$ is the queuing jitter of the message, i.e., the maximum variation in 
queuing time relative to $T_i$, inherited from the sender task which queues the message, 
and $q_i$ represents the effective queuing time, given by: 



$$q_i = B_i + \sum_{j \in hp(i)} \left\lceil \frac{q_i + J_j + \tau_{bit}}{T_j} \right\rceil C_j + E(q_i + C_i) \qquad (9)$$ 



where the term $B_i$ is the worst-case blocking time of messages sent on $S_i$, $hp(i)$ 
is the set of streams with priority higher than $S_i$, $\tau_{bit}$ (the bit-time) caters for 
the difference in arbitration start times at the different nodes due to propagation 
delays and protocol tolerances, and $E(q_i + C_i)$ is an error term denoting 
the time required for error signalling and recovery. The reason for the blocking 
factor is that transmissions are non-preemptive, i.e., after a bus arbitration has 
started the message with highest priority among competing messages will be 
transmitted, even if a higher priority message is queued before its completion. 
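
Because the queuing time appears on both sides of its defining equation, it is usually obtained by fixed-point iteration. The Python sketch below illustrates this under stated assumptions: streams are dicts with keys `P`, `T`, `C` and `J`, a numerically smaller `P` means higher priority, the error term is passed in as a function, and the iteration limit is an arbitrary guard. The example uses the message set of Figure 1 with a hypothetical bit time of 1, zero blocking and zero jitter.

```python
import math


def queuing_time(i, streams, blocking, tau_bit, error, limit=10**6):
    """Smallest fixed point of the queuing-time equation, or None above limit.

    streams: list of dicts with keys 'P' (smaller = higher priority),
             'T' (period), 'C' (transmission time) and 'J' (jitter).
    error:   function giving the error term E(t); lambda t: 0 for no errors.
    """
    me = streams[i]
    higher = [s for s in streams if s["P"] < me["P"]]
    q = blocking
    while q <= limit:
        interference = sum(
            math.ceil((q + s["J"] + tau_bit) / s["T"]) * s["C"]
            for s in higher)
        nxt = blocking + interference + error(q + me["C"])
        if nxt == q:
            return q
        q = nxt
    return None


def worst_case_latency(i, streams, blocking, tau_bit, error):
    """Worst-case latency R_i = J_i + q_i + C_i."""
    q = queuing_time(i, streams, blocking, tau_bit, error)
    return None if q is None else streams[i]["J"] + q + streams[i]["C"]


# Message set of Figure 1 (periods 5, 10, 20; transmission times 2, 1, 1).
streams = [
    {"P": 1, "T": 5, "C": 2, "J": 0},
    {"P": 2, "T": 10, "C": 1, "J": 0},
    {"P": 3, "T": 20, "C": 1, "J": 0},
]
latencies = [worst_case_latency(i, streams, 0, 1, lambda t: 0)
             for i in range(len(streams))]
```

The iteration starts from the blocking time and increases monotonically as long as the error term is non-decreasing, so it either reaches the smallest fixed point or exceeds the guard.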



3.2 Our Previous Generalization 

In [5] we present a generalization of the relatively simplistic error model by 
Tindell and Burns [7], specifically addressing multiple sources of errors and con- 
sidering the signalling pattern of individual sources. Each source can typically be 
characterized by a pattern of shorter or longer bursts, during which no signalling 
will be possible on the bus. 

In this paper we will use a slightly simplified version of the error model 
introduced in [5]. Our definition of the error term $E_i(t)$ is based on the following: 

- There are $k$ sources of interference, with each source $l$ contributing an error 
term $E_i^l(t)$. Their combined effect is $E_i(t) = E_i^1(t) \mid E_i^2(t) \mid \ldots \mid E_i^k(t)$, where 
$\mid$ denotes composition of error terms. 

- Each source $l$ interferes by inducing an undefined bus value during a characteristic 
time period $P^l$. Each such interference will (if it coincides with a 
transmission) lead to a transmission error. If $P^l$ is larger than $\tau_{bit}$, then the 
error recovery will be delayed accordingly. 

- Patterns of interferences for each source $l$ can independently be specified 
as a sequence of bursts with period $T_f^l$, where each group consists of $n^l$ 
interferences of length $P^l$ and with period $t_f^l$. 

We can now define Ei{t) for the case of k sources of interference: 

Ept)=El{t)mt)\---\EHt) (10) 



where 



El{t) = Bu\t) * {0\ + max{Q, P - Tbu)) 



( 11 ) 




Reliability Modelling of Time-Critical Distributed Systems 



101 



where 



Bv}{t) = 



t 


1 1 


't mod Tl' 


Ti 

If] 


* n mm n , 


J 


V 





(12) 



Some explanations: 

1. max{0,P — Tbit) defines the length of P exceeding Tbu- 



2 . 

3. 



t mod T) 



is the number of full bursts until t. 

is the number of periods that fit in the last (not completed) 
burst period in t. 

We assume that the overheads $O_i$ are given by:

$O_i = 31 \cdot T_{bit} + \max_{k \in hp(i) \cup \{i\}} (C_k)$ (13)

where $31 \cdot T_{bit}$ is the time required for error signalling in CAN and the max-term denotes the worst-case retransmission time.
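
As a rough illustration, the single-source error term of equations (11) to (13) can be evaluated numerically. The sketch below is not taken from the paper: the function and parameter names (`n` for the number of interferences per burst, `t_inner` for the period within a burst, `t_burst` for the burst period, `length` for the interference duration) are assumptions chosen for readability, all times are integers in microseconds, and the composition of several sources is deliberately left out.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for non-negative a and positive b."""
    return -(-a // b)


def bursts(t: int, n: int, t_inner: int, t_burst: int) -> int:
    """Interferences from one source during an interval of length t (eq. 12)."""
    full = t // t_burst                        # completed burst periods
    partial = ceil_div(t % t_burst, t_inner)   # interferences in the current burst
    return full * n + min(n, partial)


def overhead(c_values, t_bit: int) -> int:
    """Recovery overhead O_i (eq. 13); c_values holds C_k for k in hp(i) and i itself."""
    return 31 * t_bit + max(c_values)


def error_term(t, n, length, t_inner, t_burst, o_i, t_bit) -> int:
    """Single-source error term (eq. 11)."""
    return bursts(t, n, t_inner, t_burst) * (o_i + max(0, length - t_bit))


# Hypothetical numbers: 4 us bit time, one 540 us frame, and a source producing
# 4 interferences of 500 us every 4 ms, with bursts repeating every 30 s.
o_i = overhead([540], t_bit=4)
e = error_term(1080, n=4, length=500, t_inner=4000,
               t_burst=30_000_000, o_i=o_i, t_bit=4)
print(o_i, e)  # prints: 664 1160
```

Integer microseconds are used so that the floor, ceiling and modulo operations are exact.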



3.3 Analysis with Random Phasings of Interferences 

The analysis above assumes that interference hits the system in a worst case 
scenario. This assumption will now be relaxed. Our relaxed model will be based 
on 

1. Worst-case phasings of queuings at time 0 in the LCM (actually this could be 
at any time, so why not choose 0) . This introduces some pessimism, since the 
worst case may not occur in every LCM, but is consistent with the assumed 
traffic in the interference free model. 

2. Random phasings of interferences. This can be expressed as an offset from the beginning of the LCM to when the first interference hits. For each source l that hits the LCM, such an offset ($\mathit{offset}^l$) should be "sampled" (as outlined in Section 2.2).

Intuitively, extending the schedulability analysis equations to also cover random phasings of interferences seems the right approach to solve our problem. One candidate solution is the following minor modification of the schedulability formulas, replacing $E_i^l(t)$ with

$E_i^l(t) = Bu^l(\max(0, t - \mathit{offset}^l)) \cdot \left(O_i + \max(0, I^l - T_{bit})\right)$ (14)

Unfortunately, there is a risk that the resulting analysis underestimates the probability of deadline misses: it only considers the first invocation of each message in the LCM, and since the interference may now start later, a subsequent invocation may experience a larger latency than the first one. To avoid optimism in the analysis we advocate other methods below, even though we think the schedulability analysis approach deserves further investigation and evaluation before being dismissed.
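
The offset-based variant can be sketched as follows. This is only an illustration of the idea: the burst-counting helper repeats the burst model of Section 3.2 under assumed parameter names, times are integer microseconds, and drawing the offset uniformly over one burst period is an assumption of this sketch, not the sampling scheme of Section 2.2.

```python
import random


def bursts(t, n, t_inner, t_burst):
    """Interferences from one source in an interval of length t (burst model)."""
    full = t // t_burst
    partial = -((-(t % t_burst)) // t_inner)   # integer ceiling
    return full * n + min(n, partial)


def error_term_with_offset(t, offset, n, length, t_inner, t_burst, o_i, t_bit):
    """Error term when the first interference arrives `offset` after time 0."""
    shifted = max(0, t - offset)               # no interference before the offset
    return bursts(shifted, n, t_inner, t_burst) * (o_i + max(0, length - t_bit))


def sample_offset(t_burst, rng):
    """Assumption: the phasing is uniform over one burst period."""
    return rng.randrange(t_burst)


rng = random.Random(1)                          # fixed seed for repeatability
offset = sample_offset(30_000_000, rng)
print(error_term_with_offset(1080, offset, 4, 500, 4000, 30_000_000, 664, 4))
```

With an offset larger than the interval under study the error term is zero, which is exactly why looking only at the first invocation can be optimistic.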

The following three approaches provide successively refined estimates of the 
probability of a deadline miss: 

1. Use 1 if any message misses its deadline under critical-instant assumptions, and 0 otherwise. This corresponds to the original analysis. But keep in mind that the result of a deadline miss here will not be 0, since we will only conclude a deadline violation in LCMs subjected to interference.

2. Add up all the idle time slots during the LCM, giving, say, I. If the number of idle slots is $n_I$ and the error interference duration is E, calculate $I - n_I \times E$ (this removes the possibility of an error burst spilling into a busy period). Now

gives a crude, but better approximation. 

3. Our final, and most accurate method, is based on a simulation of the message 
transfer and interference in the LCM. The basis is that for each combina- 
tion of samples we will get a static scenario with fixed release times and 
parameters. The most efficient analysis is then probably a straight-forward 
simulation. Simulation possibly will provide best estimates for the proba- 
bility of deadline miss, since it corresponds to actually running the system 
during an LCM. This allows more restricted traffic patterns to be handled 
(possibly better corresponding to reality). 
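
A minimal sketch of such a simulation for one fixed scenario is given below. It is not the authors' simulator. It assumes non-preemptive fixed-priority arbitration, all messages released at time 0 and then periodically, integer times in microseconds, and a simplified recovery rule (wait until the interference ends, then spend 31 bit times on error signalling before retransmitting). The function name and tuple layouts are hypothetical.

```python
def simulate_lcm(messages, interferences, horizon, t_bit):
    """Count deadline misses in [0, horizon) for one fixed interference scenario.

    messages:      list of (priority, period, deadline, c); lower number = higher priority
    interferences: list of (start, length) intervals during which the bus is unusable
    """
    releases = sorted(
        (k * period, prio, k * period + deadline, c)
        for prio, period, deadline, c in messages
        for k in range(-(-horizon // period))     # releases at 0, period, ... < horizon
    )
    pending, misses, t, i = [], 0, 0, 0
    while i < len(releases) or pending:
        # Queue every instance released up to the current time.
        while i < len(releases) and releases[i][0] <= t:
            rel, prio, dl, c = releases[i]
            pending.append((prio, rel, dl, c))
            i += 1
        if not pending:
            t = releases[i][0]                    # bus idle: jump to the next release
            continue
        pending.sort()                            # arbitration: highest priority first
        prio, rel, dl, c = pending[0]
        hits = [(s, l) for s, l in interferences if s < t + c and s + l > t]
        if hits:
            s, l = min(hits)
            # Assumed recovery: wait out the interference, then 31 bit times of
            # error signalling; the frame stays queued and is retransmitted.
            t = s + l + 31 * t_bit
            continue
        t += c                                    # non-preemptive transmission
        pending.pop(0)
        if t > dl:
            misses += 1
    return misses


one_message = [(1, 4000, 4000, 540)]
print(simulate_lcm(one_message, [], 8000, 4))
print(simulate_lcm(one_message, [(0, 5000)], 8000, 4))
```

Running the function once per sampled combination of offsets, and dividing the number of scenarios with at least one miss by the number of scenarios, would give an estimate of the deadline-miss probability.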

3.4 Effects of Bit Stuffing 

Another important factor contributing pessimism is the bit-stuffing overhead assumed in the analysis. As mentioned in Section 3, the worst-case number of stuff bits is assumed and is accounted for in the worst-case transmission time, $C_i$. In applications where all the probable message bit patterns are
known a priori, we can use that information to derive a new worst case trans- 
mission time. For systems where there is considerable amount of uncertainty in 
message patterns, we still can improve our results by using simple probabilistic 
estimates of the worst-case transmission times. Since any bit can be '1' or '0', we can, by assuming the probability of each to be 1/2, estimate the probability of adding a certain number of stuff bits; e.g., the probability of adding one stuff bit after the first 5 bits subjected to bit-stuffing is $(1/2)^4$. By providing probabilities
for different message lengths (and in turn for the worst case transmission times), 
the simulation results can give closer match to the real scenario. Details of these 
calculations are provided in [4]. 
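
To make the probabilistic argument concrete, the following sketch counts stuff bits for a given bit sequence and enumerates all 5-bit prefixes. It assumes the usual CAN rule (after five consecutive bits of equal value a complementary bit is inserted, and that stuff bit counts towards the next run) and independent, equally likely bit values. The function name is hypothetical and the code is not the calculation from [4].

```python
from itertools import product


def count_stuff_bits(bits):
    """Number of stuff bits inserted into a sequence of 0/1 payload bits."""
    stuffed = 0
    run_bit, run_len = None, 0
    for b in bits:
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            stuffed += 1
            run_bit, run_len = 1 - b, 1     # the stuff bit starts a new run
    return stuffed


# Probability that the first five bits already force one stuff bit.
patterns = list(product((0, 1), repeat=5))
p_one = sum(count_stuff_bits(p) == 1 for p in patterns) / len(patterns)
print(p_one)  # prints: 0.0625
```

Only the two patterns 00000 and 11111 trigger a stuff bit, giving 2/32. The same enumeration, or random sampling for longer frames, yields a distribution over frame lengths that a simulation could draw from.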

4 Example : A Distributed Braking System 

We now present a case study of a simplified Antilock Braking System (ABS), where each separate brake is controlled by a computer. Furthermore, there is one computer that controls the brake pedal. All nodes are connected by a CAN bus (see Figure 2). The application is a distributed control algorithm, which calculates the brake force for each wheel depending on the brake pressure obtained from the driver. Each wheel computer therefore has to receive information about the state of the other wheels to be able to perform correct calculation and actuation. Thus each wheel is equipped with a sensor that monitors the rotation of the wheel. Each node sends the monitored values periodically.




Fig. 2. Typical Computer Network in a Car with ABS 



Since ABS is a subsystem of the entire vehicle system, we assume an appro- 
priate reliability figure (say 10“®) to be attained by the ABS. This figure is in 
fact mandated from an overall system reliability requirement of 10“^. 

Table 1 (left half) specifies a typical subset of messages sent through CAN in 
this simplified ABS and their timing details (in ms). The timing parameters are 
typically requirements derived from the vehicle dynamics by a control engineer. 
Priority 1 is assumed to be the highest and 6 is the lowest. We also assume that 
the CAN bus operates at 250 Kbps. 



4.1 Interference Characteristics 

In our example of the communication subsystem of a vehicle, two typical sources of interference could be a mobile phone lying inside the vehicle and radar transmissions from ships while the vehicle crosses bridges. Mobile phones (such as GSM phones) typically operate at frequencies of 900 MHz to 1800 MHz. When transmission occurs, the carrier is active for a period of 500 µs out of a 4 ms cycle. Also, each half-hour or so, the mobile phone will send signals to the base station. In addition, on a moving vehicle extra signals are sent when the phone switches between base stations. We assume 4 interferences in a burst and a typical interval between bursts of 30 s. For the interference from radar, we assume a duration of 1 ms, with 1 interference per burst and an interval between bursts of 1000 s.




4.2 Reliability Analysis Results 

A typical mission time for an ABS could be, say, less than 10 hours, and our analysis is based on worst-case scenarios in an LCM (which typically is in the order of a second). Table 1 shows the results of response time analysis. The column headed by '0' shows the message latencies under no errors, whereas the columns headed by '1', '2', and '3' show message latencies under interferences from source 1 (mobile phone), source 2 (radar), and both sources, respectively. A '*' indicates a deadline miss.



            A typical Message Set                |          Response Time
Msg ID       Priority   Ti   Di   size   Ci     |   (0)      (1)      (2)       (3)
OPERATOR-1      1        8    8    8    0.54    |  1.08     2.24     2.74      3.908
ABS-1           2        4    4    8    0.54    |  1.64     2.78     3.28    * 4.448
ABS-2           3        4    4    8    0.54    |  2.16     3.32     3.82    * 6.692
ABS-3           4        4    4    8    0.54    |  2.70     3.86   * 4.36    * 7.772
ABS-4           5        4    4    8    0.54    |  3.24   * 4.40   * 6.52    * 12.176
OPERATOR-2      6       15   15    8    0.54    |  3.78     7.10     7.60    * 16.030

Table 1. Response Time Analysis - Normal and under Faults



Note that, 5 out of 6 messages miss their deadlines under the worst case 
phasing of combined interferences from both sources. We conducted simulation 
runs with and without random bit-stuffing and the results are shown in Table 
2. Combining the obtained failure probabilities with reasonable assumptions on



Interference   Total      Missed         Failure       Missed     Failure
Sources        Messages   (worst-case)   Probability   (random)   Probability
Source-I1      15064      5              0.00033       3          0.000199
Source-I2      15064      41             0.00272       14         0.000929
I1 and I2      15064      44             0.00292       16         0.001062

Table 2. Simulation results for worst-case and random bit-stuffing



occurrence of interferences, we get the overall failure probability to be of the order
5x10“^^, which is quite negligible in relation to the admissible failure probability 
of Details on these calculations and the assumptions are provided in [4]. 
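
The per-scenario failure probabilities in Table 2 appear to be the ratio of missed messages to total messages. That reading is an inference from the numbers, not a statement in the paper, and it can be checked with a few lines (the dictionary layout and function name are hypothetical):

```python
# (total messages, missed with worst-case stuffing, missed with random stuffing)
table2 = {
    "I1": (15064, 5, 3),
    "I2": (15064, 41, 14),
    "I1 and I2": (15064, 44, 16),
}


def failure_probability(missed: int, total: int) -> float:
    """Fraction of simulated messages that missed their deadline."""
    return missed / total


for name, (total, worst, rnd) in table2.items():
    print(f"{name}: worst-case {failure_probability(worst, total):.5f}, "
          f"random {failure_probability(rnd, total):.6f}")
```

The overall figure quoted in the text additionally depends on how often the interference sources occur, which is covered in [4] and not reproduced here.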

5 Conclusions 



We have presented results from ongoing work to develop a framework that allows controlled relaxation of the timing requirements of safety-critical hard real-time systems. By integrating hard real-time schedulability with the reliability analysis normally used to estimate the imperfection of reality, we obtain a more accurate reliability analysis framework, which can provide reasonable arguments for making design trade-offs, e.g., choosing a slower (and less expensive) bus or CPU, even though the timing requirements are violated in some rare worst-case scenario.

Using traditional schedulability analysis techniques, the designer will have 
no other choice than to redesign the system (in hardware, software or both), if 
the emphasis is only on worst case interference hits. However, by resorting to 
our new analysis, if the probability of such an extreme situation arising is very 
low (in relation to the reliability requirements), then the designer may very well 
avoid such a costly step. 

Further research planned on our approach includes:

- Extensions by stochastic modeling of external interferences, and distributions 
of execution times of tasks, jitter, periods for sporadic tasks, etc. Some of 
these extensions require dependency issues to be carefully considered. 

- A comparison of the schedulability and simulation based approaches. 

References 

1. N. C. Audsley, A. Burns, M.F. Richardson, K. Tindell, and A.J. Wellings. App- 
lying New Scheduling Theory to Static Priority Pre-emptive Scheduling. Software 
Engineering Journal, 8(5):284-292, September 1993. 

2. A. Burns. Preemptive Priority Based Scheduling: An Appropriate Engineering Ap- 
proach. Technical Report YCS 214, University of York, 1993. 

3. A. Burns, S. Punnekkat, L. Strigini, and D.R. Wright. Probabilistic scheduling 
guarantees for fault-tolerant real-time systems. Proceedings of DCCS-7,IFIP Inter- 
national Working Conference on Dependable Computing for Critical Applications, 
California, January 1999. 

4. H. Hansson, C. Norstrom, and S. Punnekkat. Reliability Modelling of Time-Critical 
Distributed Systems. Technical report, MRTC, Malardalen University, July 2000. 

5. S. Punnekkat, H. Hansson, and C. Norstrom. Response Time Analysis under Errors for CAN. Proceedings of IEEE Real-Time Technology and Applications Symposium (RTAS), to appear, June 2000.

6. L. Sha, R. Rajkumar, and J.P. Lehoczky. Priority Inheritance Protocols: An Ap- 
proach to Real-Time Synchronization. IEEE Transactions on Computers, 39(9):1175- 
1185, September 1990. 

7. K. W. Tindell and A. Burns. Guaranteed message latencies for distributed safety- 
critical hard real-time control networks. Technical Report YCS229, Dept, of Com- 
puter Science, University of York, June 1994. 

8. K. W. Tindell, A. Burns, and A. J. Wellings. Calculating Controller Area Net- 
work (CAN) Message Response Times. Control Engineering Practice, 3(8):1163-1169, 
1995. 

9. K. W. Tindell, H. Hansson, and A. J. Wellings. Analysing Real-Time Communica- 
tions: Controller Area Network (CAN). Proceedings 15th IEEE Real-Time Systems 
Symposium, pages 259-265, December 1994. 

10. J. Xu and D. L. Parnas. Priority scheduling versus pre-run-time scheduling. Real- 
Time Systems Journal, 18(1), January 2000. 




A Methodology for the Construction of 
Scheduled Systems 



K. Altisen, G. Gößler, and J. Sifakis

Verimag, 2 av. Vignates, 38610 Gières, France
{altisen, goessler, sifakis}@imag.fr



Abstract. We study a methodology for constructing scheduled systems 
by restricting successively the behavior of the processes to be scheduled. 
Restriction is used to guarantee the satisfaction of two types of con- 
straints: schedulability constraints characterizing timing properties of the 
processes, and constraints characterizing particular scheduling algorithms 
including process priorities, non-idling, and preemption. 

The methodology is based on a controller synthesis paradigm. The main 
results deal with the characterization of scheduling policies as safety con- 
straints and the simplification of the synthesis process by applying a com- 
posability principle. 



1 Introduction 

Scheduling coordinates the execution of application and system activities so that requirements about their temporal behavior are met. Guaranteeing correctness
of schedulers is essential for the development of dependable real-time systems. In 
many application areas, well established theory and scheduling algorithms have 
been successfully applied to real-time systems development. 

Existing scheduling theory is limited because it requires the system to fit into 
the mathematical framework of the schedulability criterion (e.g. all tasks are 
supposed periodic, worst case execution times are known). Studies to relax such 
hypotheses have been carried out but they generalize one hypothesis at a time, 
and no unified approach has been proposed. 

To overcome limitations of scheduling theory, it is important to study its 
connections to specification theory and take advantage of their complementarity 
[8, 13,4]. The specification based approach consists in building a timed model of 
the scheduled system or of an abstraction of it. Then, timed analysis tools are 
used either to check that the exact model meets scheduling requirements or to 
extract from the abstraction a scheduler [6, 10]. 

A major difficulty in applying this approach is the generation of the timed 
model from some description of the scheduling method. In fact, scheduling deals 
with the very dynamic nature of real-time systems, and behavior modeling requires a deep understanding of mechanisms such as priorities and preemption, as well as of concepts such as urgency, idling, and timeliness.

In this paper we propose a methodology for modeling scheduling algorithms 
that constructs compositionally the scheduled system from a global timed model 
based on 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 106-120, 2000. 

© Springer- Verlag Berlin Heidelberg 2000 







1. A functional description of the processes to be scheduled, their resources, and 
the associated synchronization and management constraints; 

2. Timing requirements added to the functional description and relating in par- 
ticular execution speed with the dynamics of the external environment; 

3. A description of a scheduling algorithm consisting of three types of require- 
ments about 

(a) Fixed or dynamic priorities, for choosing between pending requests of the 
processes, 

(b) Possibility of idling, meaning that the scheduler may not satisfy a pend- 
ing request anticipating the satisfaction of a forthcoming higher priority 
request, 

(c) Preemption, that is, for a given preemption order between processes, a 
process of lower priority is preempted when a process of higher priority 
raises a request. 

In previous papers [3, 2] we have shown how a functional description can be ex- 
tended into a timed one by preserving progress properties. In this paper we study 
a methodology for constructing a scheduled system from scheduling requirements 
and a timed specification of the processes to be scheduled. The methodology is 
based on the controller synthesis paradigm [11,9, 1]. A scheduler is considered as 
a controller of the processes to be scheduled which restricts their behavior by trig- 
gering their controllable actions. The restricted behavior must respect the timing 
constraints of the processes as well as constraints characterizing the scheduling 
requirements. 

We have shown in [1] how schedulers can be computed by applying a synthe- 
sis algorithm to timed automata. The synthesis algorithm computes iteratively 
from a constraint K characterizing scheduling requirements, the maximal con- 
trol invariant K', K' => K. The latter denotes the set of states from which K is 
guaranteed. The behavior of the scheduled system is obtained by restricting the 
controllable actions of the processes so as to respect the control invariant K' . 

The application of synthesis techniques is limited for two reasons. First, the 
practical complexity of the synthesis algorithm is high even in the case of timed 
automata without scheduling policy constraints. Second, scheduling with preemp- 
tion requires the use of automata with integrators [5] which implies that iterative 
computation of control invariants may not terminate. 

The proposed methodology allows the global controller synthesis procedure to be decomposed into simpler steps. At each step a control
invariant corresponding to a particular class of constraints is applied to further 
restrict the behavior of the system of processes to be scheduled. The presented 
results can be summarized as follows: 

1. Global scheduling requirements can be characterized by a constraint K of the form $K = K_{algo} \wedge K_{sched}$, where $K_{algo}$ specifies a particular scheduling algorithm and $K_{sched}$ characterizes schedulability requirements of the processes. Furthermore, $K_{algo}$ is a conjunction of constraints about the scheduling policy, the possibility of non-idling, and preemption;







2. A step of the method corresponds to the computation of a controller for some 
constraint. The control invariant corresponding to a constraint can be com- 
puted in a straightforward manner (without iterative fixpoint computation) ; 

3. The scheduled system can be obtained by successive applications of steps re- 
stricting the process behavior by control invariants implying all the schedul- 
ing constraints, provided that some composability conditions are satisfied. 
In fact, the restriction by a control invariant does not necessarily preserve 
previously imposed control invariants. 

The methodology allows an incremental construction of a scheduled system, 
or of an abstraction of it, if some steps fail. 

The paper is composed of two sections. The first section presents basic re- 
sults about control invariants and their composability. The second section shows 
how scheduling requirements can be expressed as constraints which are control 
invariants in some cases. The application of the methodology is illustrated by 
examples. 

2 Control Invariants and Composability 

2.1 Timed System 

To model scheduling algorithms, we use reactive timed systems with two kinds of 
actions as in [1]: controllable actions that can be triggered by the scheduler, and 
uncontrollable actions that can be considered as internal actions of the processes 
to be scheduled. Controllable actions are typically resource allocations and pro- 
cess preemption, while uncontrollable actions are process arrival and termination. 

Both controllable and uncontrollable actions are submitted to timing con- 
straints expressed in terms of real-valued variables called timers. The derivatives 
of timers may take the values 0 or 1, as specified by a boolean vector. 

Definition 2.1. (X-constraint). Let X be a finite set of timers $\{x_1, \dots, x_m\}$, real-valued variables defined on the set of non-negative reals $\mathbb{R}_+$. A predicate C generated by the grammar $C ::= x \# d \mid x - y \# d \mid C \wedge C \mid \neg C$, where $x, y \in X$, d is an integer, and $\# \in \{<, \le\}$, is called an X-constraint.

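Definition 2.2. (Falling Edge). Let C be an X-constraint, and b be a boolean derivative vector of $\{0,1\}^m$. The closed (resp. open) falling edge of C w.r.t. b, written $\downarrow_b C$ (resp. $\mathring{\downarrow}_b C$), is defined as, $\forall x \in \mathbb{R}_+^m$,

$\downarrow_b C(x) \Leftrightarrow C(x) \wedge \exists t > 0 \,.\, \forall t' \in (0,t] \,.\, \neg C(x + t'b)$

$\mathring{\downarrow}_b C(x) \Leftrightarrow \neg C(x) \wedge \exists t > 0 \,.\, \forall t' \in (0,t] \,.\, C(x - t'b)$.

Example 2.1. Let $X = \{x_1, x_2\}$ be the set of real-valued variables. $C = x_1 \le 3$ and $C' = 2 \le x_1 < 6 \wedge x_1 - x_2 \le 4$ are X-constraints. For $b = (1,1)$ and $b' = (1,0)$, we have:

$\downarrow_b C = (x_1 = 3)$, $\mathring{\downarrow}_b C = false$, $\downarrow_b C' = false$, $\mathring{\downarrow}_b C' = (x_1 = 6) \wedge (x_2 \ge 2)$

$\downarrow_{b'} C = (x_1 = 3)$, $\mathring{\downarrow}_{b'} C = false$, $\downarrow_{b'} C' = (x_1 - x_2 = 4) \wedge (x_2 < 2)$, $\mathring{\downarrow}_{b'} C' = (x_1 = 6) \wedge (x_2 \ge 2)$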







Definition 2.3. (Timed System). A timed system is

1. An untimed labeled transition system (S, A, T) where S is a finite set of control states; A is a finite vocabulary of actions partitioned into two sets of controllable and uncontrollable actions, noted $A^c$ and $A^u$; $T \subseteq S \times A \times S$ is an untimed transition relation;

2. A finite set of timers $X = \{x_1, \dots, x_m\}$, as in definition 2.1;

3. A function b mapping S into $\{0,1\}^m$. The image of $s \in S$ by b, denoted $b_s$, is a boolean derivative vector;

4. A labeling function h mapping untimed transitions of T into timed transitions: $h(s,a,s') = (s,a,g,\tau,r,s')$, where the guard g is an X-constraint; the reset $r \subseteq X$ is a set of timers to be reset; $\tau \in \{\lambda, \delta, \epsilon\}$ is an urgency type, respectively lazy, delayable, eager.

Semantics. A timed system defines a transition graph $(V, \mathcal{E})$ constructed as follows. $V = S \times \mathbb{R}_+^m$, that is, vertices (s, x) are states of the timed system. The set $\mathcal{E} \subseteq V \times (A \cup \mathbb{R}_+) \times V$ of the edges of the graph is partitioned into three classes of edges: $\mathcal{E}^c$ controllable, $\mathcal{E}^u$ uncontrollable, and $\mathcal{E}^t$ timed, corresponding respectively to the case where the label is a controllable action, an uncontrollable action, and a (strictly) positive real.

Given $s \in S$, let J be the set of indices such that $\{(s, a_j, s_j)\}_{j \in J}$ is the set of all the untimed transitions departing from s. Also let $h(s,a_j,s_j) = (s,a_j,g_j,\tau_j,r_j,s_j)$. For all $j \in J$, $((s,x), a_j, (s_j, x[r_j])) \in \mathcal{E}^c \cup \mathcal{E}^u$ iff $g_j(x)$, where $x[r_j]$ is the timer valuation obtained from x when all the timers in $r_j$ are set to zero and the others are left unchanged.

To define $\mathcal{E}^t$, we use the predicate $\varphi$, called time progress function. The notation $\varphi((s,x),t)$ means that time can progress from state (s, x) by t.

( , ^ C Vi' G [0,t) . A 

ip((s,x),t) - A ^ + 

teJ [ Tj = e => Vi' G [0,t) . ^gj{x + t'bg) 

If $\varphi((s,x),t)$, then $((s,x), t, (s, x + t b_s)) \in \mathcal{E}^t$, where $x + t b_s$ is the valuation obtained from x by increasing by t the timer values for which $b_s$ elements are equal to one.

The above definition means that at control state s, time cannot progress 
whenever an eager transition is enabled, or beyond the falling edge of a delayable 
guard. 

We will usually denote by TS a timed system. $TS^c$ (resp. $TS^u$) represents the timed system composed of the controllable (resp. uncontrollable) transitions of TS only.

Proposition 2.1. If $\varphi$, $\varphi^c$ and $\varphi^u$ are respectively the time progress functions of TS, $TS^c$ and $TS^u$, then $\varphi = \varphi^c \wedge \varphi^u$.

Example 2.2. (A Periodic Process). Let us model a periodic non-preemptible process P as a timed system. P is of period T > 0 and uses the CPU for an execution time E. It also has a relative deadline of D ($D \le T$).







As shown in Fig. 2, the timed system has three control states, s, w, and e, where P is respectively sleeping, waiting for the CPU, and executing on the CPU. The actions a, b, and f stand for arrive, begin, and finish. The timer x is used to measure execution time while the timer t measures the time elapsed since the process has arrived. In all states, both timers progress. The only controllable action is b.

Fig. 2: A periodic process.

By convention, transition labels are of the form $a^{x}, g^{\tau}, r$, where the superscript x can be u (uncontrollable) or c (controllable), and $\tau$ is an urgency type. The set r is omitted if it is empty.

Notice that since the transition b is delayable, the process might wait for a non- 
zero time although the CPU is free: idling is permitted. A non-idling process is 
modeled by changing the urgency type of the transition b to eager (see example 2.5 
for further details). A preemptive periodic process is modeled in section 3.3. 
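
The effect of urgency types on time progress can be illustrated for this process with a small sketch. Fig. 2 is not reproduced in this text, so the guards used below (a: eager with t = T, b: delayable with t ≤ D − E, f: eager with x = E) are assumptions inferred from the constraints that appear in examples 2.3 and 2.4, and the function name is hypothetical.

```python
INF = float("inf")


def max_delay(state, t, x, T, E, D, b_eager=False):
    """Longest time that may elapse in `state` before a transition becomes urgent.

    state: "s" (sleeping), "w" (waiting) or "e" (executing)
    t, x:  current values of the two timers
    """
    if state == "s":                       # a: eager, assumed guard t = T
        return T - t if t <= T else INF
    if state == "w":                       # b: assumed guard t <= D - E
        if t > D - E:
            return INF                     # guard can no longer become true
        return 0 if b_eager else D - E - t # eager: no idling; delayable: up to the falling edge
    if state == "e":                       # f: eager, assumed guard x = E
        return E - x if x <= E else INF
    raise ValueError(f"unknown control state: {state}")


# Process with (E, T, D) = (5, 15, 15), waiting since t = 4.
print(max_delay("w", t=4, x=0, T=15, E=5, D=15))                 # prints: 6
print(max_delay("w", t=4, x=0, T=15, E=5, D=15, b_eager=True))   # prints: 0
```

The two calls contrast the delayable and eager versions of b: with a delayable b the process may idle for up to 6 time units, whereas the eager version forbids any waiting while the guard holds.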

2.2 Restriction and Control Invariants 

Definition 2.4. (Constraint). Given a timed system with a set of timers X and a set of control states $\{s_1, \dots, s_n\}$, a constraint is a state predicate represented as an expression of the form $\bigvee_{i=1}^{n} s_i \wedge C_i$, where $C_i$ is an X-constraint and $s_i$ is (also) the boolean denoting presence at state $s_i$.

Definition 2.5. (Restriction). Let TS be a timed system and K be a constraint. The restriction of TS by K, denoted TS/K, is the timed system TS where each guard g of a controllable transition $(s, a, g, \tau, r, s')$ is replaced by

$g'(x) = g(x) \wedge K(s', x[r])$.

Notice that in the restriction TS/K, the states reached right after execution of a controllable transition satisfy K. Moreover, it follows from the definition that $(TS/K_1)/K_2 = TS/(K_1 \wedge K_2)$.

Definition 2.6. (Proper Invariant). Let TS be a timed system and K be a constraint. We say that K is a proper invariant of TS, denoted by $TS \models inv(K)$, if K is preserved by the edges of $\mathcal{E}$, i.e., $\forall (s,x) \,.\, K(s,x) \Rightarrow \forall ((s,x), \cdot, (s',x')) \in \mathcal{E} \,.\, K(s',x')$.

Proper invariants, called simply invariants for closed systems, are constraints 
preserved by all the transitions of the system. We use the term “proper” to 
distinguish them from control invariants introduced in the following definition. 
Control invariants are constraints that are satisfied by the restricted system. 

Definition 2.7. (Control Invariant). Let TS be a timed system and K be a constraint. K is a control invariant of TS if $TS/K \models inv(K)$.








Proposition 2.2. If K is a proper invariant of a timed system TS, then K is a 
control invariant of TS. 

This property follows from the trivial observation that if TS and TS/K are ini- 
tialized in K, then they have the same behavior. However, notice that control 
invariants are not proper invariants, in general. 

Proposition 2.3. For any timed system TS and constraint K such that $TS^u \models inv(K)$, K is a control invariant of TS (i.e. $TS/K \models inv(K)$).

Proof (sketch). Assume $K(s,x)$ for some state (s, x). To prove $TS/K \models inv(K)$ it must be shown that K is preserved in TS/K by (1) controllable, (2) uncontrollable, and (3) timed edges of TS/K. By construction of TS/K, (1) is true. From $TS^u \models inv(K)$, (2) and (3) follow.

Definition 2.8. (Timed System of Processes). A timed system of processes is a timed system TS = (S, A, T, X, b, h) obtained by composition of processes, where a process $P_i$ is a timed system $(S_i, A_i, T_i, X_i, b_i, h_i)$. TS is the timed system of n processes $\{P_1, \dots, P_n\}$ if

$S = S_1 \times \dots \times S_n$; $A = A_1 \cup \dots \cup A_n$; $X = X_1 \cup \dots \cup X_n$;

For $s = (s_1 \dots s_n) \in S$ and $x \in X_i$, $b_s[x] = b_{i,s_i}[x]$;

For $s = (s_1 \dots s_i \dots s_n)$ and $s' = (s_1 \dots s_i' \dots s_n) \in S$,

$t = (s, a_i, s') \in T \Leftrightarrow t_i = (s_i, a_i, s_i') \in T_i$

$h(t) = (s, a_i, g_i, \tau_i, r_i, s') \Leftrightarrow h_i(t_i) = (s_i, a_i, g_i, \tau_i, r_i, s_i')$.

We assume that processes have disjoint sets of control states and timers. Moreover, we accept that guards are general constraints on timers and control states as in definition 2.4.

Example 2.3. (Mutual Exclusion). Consider a timed system of n periodic non-preemptible processes $\{P_1, \dots, P_n\}$, instances of the generic process of Fig. 2, and the constraint

$K_{mutex} = \bigwedge_{i \ne j} \neg e_i \vee \neg e_j$

expressing mutual exclusion. It is trivial to check that $K_{mutex}$ is a control invariant, as $TS^u \models inv(K_{mutex})$. In fact, $K_{mutex}$ is time invariant and is preserved by uncontrollable transitions.

If TS is the timed system of two processes, as in Fig. 2, for which the parameters (E, T, D) are equal to (5, 15, 15) and (2, 5, 5), respectively, and if $K_{mutex} = \neg e_1 \vee \neg e_2$, then $TS_1 = TS/K_{mutex}$ is obtained by restricting the controllable guards $g_{b_1}$ and $g_{b_2}$ to

$g'_{b_1} = (t_1 \le D_1 - E_1) \wedge \neg e_2 = (t_1 \le 10) \wedge \neg e_2$

$g'_{b_2} = (t_2 \le D_2 - E_2) \wedge \neg e_1 = (t_2 \le 3) \wedge \neg e_1$.
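
Restriction is a purely syntactic strengthening of guards, which the following sketch tries to make concrete for the two-process mutual exclusion example. It is an illustration only: the `Transition` class, the dictionary encoding of timer valuations and the assumption that b resets the execution timer are choices of this sketch, not definitions from the paper.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, FrozenSet, Tuple

State = Tuple[str, str]              # control states of (P1, P2), e.g. ("w", "s")
Valuation = Dict[str, float]         # timer name -> value


@dataclass(frozen=True)
class Transition:
    source: State
    action: str
    guard: Callable[[Valuation], bool]
    resets: FrozenSet[str]
    target: State
    controllable: bool


def apply_resets(x: Valuation, resets: FrozenSet[str]) -> Valuation:
    """x[r]: timers in `resets` set to zero, the others unchanged."""
    return {name: (0.0 if name in resets else value) for name, value in x.items()}


def restrict(tr: Transition, K) -> Transition:
    """Definition 2.5: g'(x) = g(x) and K(s', x[r]) for controllable transitions."""
    if not tr.controllable:
        return tr

    def guard(x: Valuation, g=tr.guard) -> bool:
        return g(x) and K(tr.target, apply_resets(x, tr.resets))

    return replace(tr, guard=guard)


def k_mutex(state: State, x: Valuation) -> bool:
    """Not both processes executing."""
    return not (state[0] == "e" and state[1] == "e")


def g_b1(x: Valuation) -> bool:
    return x["t1"] <= 10                       # t1 <= D1 - E1 with (E, T, D) = (5, 15, 15)


# b1 taken while P2 sleeps, and b1 taken while P2 executes.
b1_free = Transition(("w", "s"), "b1", g_b1, frozenset({"x1"}), ("e", "s"), True)
b1_busy = Transition(("w", "e"), "b1", g_b1, frozenset({"x1"}), ("e", "e"), True)

x = {"t1": 10.0, "x1": 0.0, "t2": 1.0, "x2": 1.0}
print(restrict(b1_free, k_mutex).guard(x))     # prints: True
print(restrict(b1_busy, k_mutex).guard(x))     # prints: False
```

Because K is evaluated in the target state, the restricted guard of b1 is disabled exactly when P2 is executing, which matches the conjunct $\neg e_2$ in the restricted guard above.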







2.3 Control Invariants and Synthesis 

Following ideas in [11], synthesis is used to partially restrict the non-determinism of a system so that it satisfies a given invariant.

Problem 2.1. (Synth). Solving the synthesis problem for a timed system TS and a constraint K amounts to giving a non-empty control invariant K' of TS which implies K, i.e. $K' \Rightarrow K$, $TS/K' \models inv(K')$.

We assume that the processes to be scheduled and their timing constraints 
are represented by a timed system of processes TS. Furthermore, we consider 
that scheduling requirements can be expressed as a constraint (safety property) 
K. A scheduled system can be obtained by solving the synthesis problem for TS 
and K, as explained in [1]. If K' is a control invariant implying K, then TS/K' 
describes a scheduled system. 

We assume that the constraint K is in general the conjunction of two constraints, $K = K_{algo} \wedge K_{sched}$. $K_{algo}$ is an optional constraint characterizing a particular scheduling algorithm. We provide in section 3 a general framework for the decomposition of $K_{algo}$ and the modeling of different scheduling policies.

Ksched expresses the fact that the timing requirements of the processes are 
satisfied. We consider that the processes to be scheduled are structurally time- 
lock-free [2]. This property means that time always eventually progresses. It is 
implied by the fact that at any control state, if no action is enabled then time 
can progress, and the requirement that in any circuit of the control graph a timer 
is reset and tested against some positive lower bound. For example, the periodic 
process of example 2.2 is structurally timelock-free. 

Notice that structural timelock-freedom is preserved by restriction. For time- 
lock-free timed systems, Ksched can be formulated as a constraint expressing the 
property that each process always eventually executes some action. This property 
implies fairness of the scheduling algorithm. 

Definition 2.9. ($\Diamond$). Let C be an X-constraint, $s \in S$ a control state, and $k \in \mathbb{N} \cup \{\infty\}$. We will use the notation

$(\Diamond_k^s C)(x) = \exists t \in [0,k] \,.\, C(x + t b_s)$

to express the property "eventually C within k in s". If the state s is clear from the context, we write $\Diamond_k$ instead of $\Diamond_k^s$. We use $(\Diamond C)(x)$ for $\exists t \ge 0 \,.\, C(x + t b_s)$.

For a timed system of processes as in definition 2.8,

$K_{sched} = \bigwedge_{P_i} K_{sched_i}$ where $K_{sched_i} = \bigvee_{s \in S_i} s \wedge \left( \bigvee_{(s,a,s') \in T_i} \Diamond g_a \right)$.

It can be shown that in general, Ksched is not a control invariant. We have shown 
in [1] how maximal schedulers for timed automata and their schedulability con- 
straints can be computed. The synthesis algorithm has been implemented in the 
Kronos tool. 







Example 2.4. (Schedulability). The schedulability constraint for the timed system of n periodic processes TS as in example 2.3 is

$K_{sched} = \bigwedge_{P_i} \left( s_i \wedge \Diamond g_{a_i} \vee e_i \wedge \Diamond g_{f_i} \vee w_i \wedge \Diamond g_{b_i} \right)$.

We consider the timed system of two processes described in example 2.3 where the mutual exclusion constraint has been applied. We have

$K_{sched} = (s_1 \wedge t_1 \le 15 \vee e_1 \wedge x_1 \le 5 \vee w_1 \wedge t_1 \le 10) \wedge (s_2 \wedge t_2 \le 5 \vee e_2 \wedge x_2 \le 2 \vee w_2 \wedge t_2 \le 3)$.

The maximal control invariant implying $K_{sched}$ computed by Kronos is

  (s1 ∧ s2 ∧ t1 <= 15 ∧ t2 <= 5)
∨ (w1 ∧ s2 ∧ (t2 <= 3 ∧ t1 < 10 ∨ t2 <= 5 ∧ t1 <= t2 + 3))
∨ (s1 ∧ w2 ∧ t1 <= 15 ∧ t2 <= 3)
∨ (e1 ∧ s2 ∧ t2 <= 5 ∧ x1 <= 5 ∧ t1 <= x1 + 10 ∧ t2 <= x1 + 3)
∨ (w1 ∧ w2 ∧ (t1 <= 8 ∧ t2 <= 1 ∨ t2 <= 3 ∧ t1 <= t2 + 3))
∨ (s1 ∧ e2 ∧ t1 <= 15 ∧ x2 <= 2 ∧ t2 <= x2 + 3)
∨ (e1 ∧ w2 ∧ x1 <= 5 ∧ t1 <= x1 + 10 ∧ t2 + 2 <= x1)
∨ (w1 ∧ e2 ∧ (x2 <= 2 ∧ t1 <= x2 + 8 ∧ t2 <= x2 + 1 ∨
              x2 <= 2 ∧ t1 <= t2 + 3 ∧ t2 <= x2 + 3)) .



In the rest of the paper, we show how to construct control invariants for some 
frequently used scheduling algorithms without fixpoint computation. 



2.4 Control Invariant Composability 

Contrary to proper invariants, control invariants are not composable by conjunction. In general, it cannot be inferred from $TS/K_i \models inv(K_i)$, $i = 1,2$, that $TS/(K_1 \wedge K_2) \models inv(K_1 \wedge K_2)$. We study a notion of control invariant composability.

Definition 2.10. (Composable Invariant). Let TS be a timed system and $K_1$ be a constraint. $K_1$ is a composable invariant of TS if for all constraints $K_2$, $K_1$ is a control invariant of $TS/K_2$ (i.e. if $TS/(K_1 \wedge K_2) \models inv(K_1)$).

Proposition 2.4. Let TS be a timed system and $K_1$ be a constraint on TS. $K_1$ is a composable invariant of TS iff $TS^u \models inv(K_1)$.

Proof. Let $K_1$ be a composable invariant of TS. By applying definition 2.10 with $K_2 = false$, we obtain: $TS/false = TS^u \models inv(K_1)$.

Conversely, assume that $TS^u \models inv(K_1)$ and let $K_2$ be some constraint. We show that $TS/(K_1 \wedge K_2) \models inv(K_1)$. Let $(s,x)$ be a state of TS such that $K_1(s,x)$. (1) If there exists a controllable edge $((s,x), a_c, (s',x'))$ in the transition graph of $TS/(K_1 \wedge K_2)$, then by definition 2.5 of restriction, $(K_1 \wedge K_2)(s',x')$, thus $K_1(s',x')$. (2) An uncontrollable edge $((s,x), a_u, (s',x'))$ of $TS/(K_1 \wedge K_2)$ is also







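an uncontrollable edge of $TS^u$, thus $K_1(s',x')$. (3) Let $\varphi_{(K_1 \wedge K_2)}$ be the time progress function of $TS/(K_1 \wedge K_2)$. According to the property 2.1, we have

$$\varphi_{(K_1 \wedge K_2)} = \varphi^c_{(K_1 \wedge K_2)} \wedge \varphi^u_{(K_1 \wedge K_2)} = \varphi^c_{(K_1 \wedge K_2)} \wedge \varphi^u .$$

If $((s,x), t, (s, x + t b_s))$ is a timed edge of $TS/(K_1 \wedge K_2)$, then it is also a timed edge of $TS^u$ because $\varphi_{(K_1 \wedge K_2)} = \varphi^c_{(K_1 \wedge K_2)} \wedge \varphi^u$. Thus, $K_1(s, x + t b_s)$ from $TS^u \models inv(K_1)$.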

Corollary 2.1. For a timed system TS and constraints $K_1$ and $K_2$, $TS^u \models inv(K_1)$ and $(TS/K_1)/K_2 \models inv(K_2)$ implies that $TS/(K_1 \wedge K_2) \models inv(K_1 \wedge K_2)$.

That is, if $K_1$ is composable and if $K_2$ is a control invariant of $TS/K_1$ then $(K_1 \wedge K_2)$ is a control invariant of TS.

This corollary justifies the incremental methodology for restricting a timed system. To impose a control invariant $K_1 \wedge K_2$ on TS, if $K_1$ is a composable invariant of TS, the restriction by a control invariant $K_2$ does not destroy the invariance of $K_1$.



Example 2.5. (Non-idling Constraint). A scheduling algorithm is said to be non-idle if the CPU cannot remain free when there is a pending request. Let us consider the timed system of n processes as in example 2.3. As $TS^u \models inv(K_{mutex})$, $K_{mutex}$ is composable, which means that $K_{mutex}$ is a proper invariant of any system obtained by restriction of $TS_1 = TS/K_{mutex}$.

In order to model non-idling, as remarked in example 2.2, all transitions $b_i$ must have the urgency type eager. The non-idling constraint $K_{non\text{-}idle}$ specifies that an enabled $b_i$ action is fired as soon as the CPU is free.

$$K_{non\text{-}idle} = \bigvee_{P_i} (e_i \vee x_i = E_i) \;\vee\; \bigwedge_{P_j} (s_j \vee w_j \wedge t_j = 0)$$

means that in a non-idling system, if no process $P_i$ is executing or has just finished its execution, then any process $P_j$ is either sleeping or waiting for zero time.

It can be shown that $K_{non\text{-}idle}$ is a proper invariant of $TS_1$. However, it fails to be composable, in general. For the timed system of two processes described in example 2.3, the constraint $K_{non\text{-}idle}$ becomes

$$K_{non\text{-}idle} = (e_1 \vee e_2) \vee (x_1 = 5 \vee x_2 = 2) \vee (s_1 \vee w_1 \wedge t_1 = 0) \wedge (s_2 \vee w_2 \wedge t_2 = 0).$$



Notice that $TS_1/K_{non\text{-}idle} = TS_1$, that is, restricting by $K_{non\text{-}idle}$ does not change controllable transitions of $TS_1$. It is easy to check that $TS_1/(K_{non\text{-}idle} \wedge K_{sched}) \not\models inv(K_{non\text{-}idle})$: consider for instance the eager transition $b_1$ from the control state $(w_1 s_2)$ to $(e_1 s_2)$ with guard $g_{b_1} = t_1 \le 10 \wedge t_2 \le 3$. When the system reaches the state $(w_1 s_2)$ with timer values $(t_1 = 0, t_2 = 4)$, the action $b_1$ is not enabled although the CPU is free, due to the restriction $t_2 \le 3$ imposed by $K_{sched}$. Thus, $K_{non\text{-}idle}$ is violated.
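The counterexample can be replayed numerically. The sketch below transcribes the two disjuncts of the Kronos invariant of example 2.4 that concern the control states $(w_1 s_2)$ and $(e_1 s_2)$, and assumes (this is not stated explicitly in the text above) that the unrestricted guard of $b_1$ is $t_1 \le 10$ and that firing $b_1$ resets $x_1$ to 0 while leaving $t_1$ and $t_2$ unchanged. Function names are illustrative.

```python
def inv_w1_s2(t1, t2):
    # Disjunct of the maximal control invariant for control state (w1, s2).
    return (t2 <= 3 and t1 < 10) or (t2 <= 5 and t1 <= t2 + 3)

def inv_e1_s2(t1, t2, x1):
    # Disjunct of the maximal control invariant for control state (e1, s2).
    return t2 <= 5 and x1 <= 5 and t1 <= x1 + 10 and t2 <= x1 + 3

def b1_allowed(t1, t2):
    # Restricted guard of b1: the assumed original guard (t1 <= 10) and the
    # requirement that the target state, reached with x1 = 0, stays in the
    # invariant. This simplifies to t1 <= 10 and t2 <= 3.
    return t1 <= 10 and inv_e1_s2(t1, t2, x1=0)

# State (w1, s2) with t1 = 0, t2 = 4 lies inside the invariant ...
state_ok = inv_w1_s2(0, 4)
# ... but b1 cannot fire there, so P1 keeps waiting while the CPU is free.
can_start = b1_allowed(0, 4)
```

With these definitions `state_ok` is true and `can_start` is false, which is exactly the situation described above: the state is admitted by the schedulability invariant, yet the eager action $b_1$ is blocked.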





Imposing $K_{sched}$ has destroyed the property of the system to be non-idle. Thus the non-idling constraint is not composable. This is a consequence of the observation that a given scheduling problem with an idling solution may have no non-idling schedule.

The notion of composability described in this section allows restrictions to be applied sequentially, so that each step brings the system closer to the correct scheduler.

3 Modeling Scheduling Algorithms 

Timed systems with priorities are timed systems of processes with an associated 
set of priority orders on actions. They have been defined and studied in [3,2]. 
We show how to model scheduling algorithms by specifying a timed system with 
priorities and that applying priorities is equivalent to restricting by a composable 
invariant. 



3.1 Timed Systems with Priorities 

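Definition 3.1. (Priority Order). Let $\prec \,\subseteq A \times (\mathbb{N} \cup \{\infty\}) \times A$ be a relation. $a_1 \prec_k a_2$ is written for $(a_1, k, a_2) \in \prec$. The relation $\prec$ is a priority order if $\forall k \in \mathbb{R}_+ \cup \{\infty\}$,

1. $\prec_k$ is a partial order;

2. $a_1 \prec_k a_2 \Rightarrow \forall k' < k \,.\, a_1 \prec_{k'} a_2$;

3. $a_1 \prec_k a_2 \wedge a_2 \prec_l a_3 \Rightarrow a_1 \prec_{k+l} a_3$.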

Definition 3.2. (Timed System with Priorities). A timed system with priorities (TS,pr) is the timed system of processes TS equipped with a priority rule, i.e., a finite set of pairs $pr = \{(C^i, \prec^i)\}_i$, where $\prec^i$ is a priority order, and $C^i$ is an $X$-constraint that specifies when the priority order applies, such that

1. $C^i \wedge C^j \ne false \Rightarrow \prec^i \cup \prec^j$ is a priority order;

2. No uncontrollable action is dominated in $\prec^i$;

3. $(C^i, \prec^i) \in pr$ and $(a, k, b) \in \prec^i$ imply that transitions labeled by $a$ do not reset any timer occurring in $C^i$.

For each state $s \in S$, let $\{(s, a_i, s_i)\}_{i \in I}$ be the set of the transitions departing from $s$, and $h(s, a_i, s_i) = (s, a_i, g_i, \tau_i, r_i, s_i)$. The timed system with priorities (TS,pr) represents a timed system TS' obtained from TS by replacing the guards $g_j$ of TS by $g'_j$ defined as follows:

$$g'_j = g_j \wedge \bigwedge_{(C,\prec) \in pr} \; \bigwedge_{a_j \prec_k a_i} \big( \neg \Diamond_k g_i \vee \neg C \big).$$

This formula says that an action $a_j$ is allowed if there is no transition $a_i$ leaving $s$ that has priority over $a_j$, and that will become enabled within a delay of $k$.






Example 3.1. (The edf Policy). Consider the timed system $TS_1$ of n non-preemptible periodic processes, on which $K_{mutex}$ has already been applied, as in example 2.3.

We show how the basic earliest deadline first (edf, [7]) mechanism can be specified by using a priority rule. A scheduler follows an edf policy if the CPU is granted to the waiting process that is closest to its relative deadline.

The edf policy is partially specified by

$$pr'_{edf} = \{ (D_i - t_i < D_j - t_j, \; \{ b_j \prec_0 b_i \}) \}_{i \ne j},$$

i.e., whenever there are two processes $P_i$ and $P_j$ waiting for the CPU, the action $b_i$ has immediate priority over the action $b_j$ if $P_i$ is closer to its relative deadline than $P_j$ (namely, $D_i - t_i < D_j - t_j$).

It is easy to check that $pr'_{edf}$ satisfies the requirements of definition 3.2. In particular, note that the constraints $D_i - t_i < D_j - t_j$ define a partial order on the set of $b_i$ actions. The complete specification of the edf policy is given in example 3.3.
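The effect of this partial edf rule can be sketched as a filter on the waiting processes: a start action $b_j$ stays allowed only if no other waiting process is strictly closer to its deadline. The sketch below assumes that the CPU is free, so that every waiting process has an enabled $b$ action; the function name `edf_allowed` and the dictionary encoding are illustrative and not part of the paper.

```python
def edf_allowed(waiting):
    """Return the waiting processes whose start action is not dominated.

    waiting -- dict mapping a process name to (D, t): its relative deadline D
               and the time t elapsed since its arrival.
    A process P_j is dominated when some P_i satisfies D_i - t_i < D_j - t_j.
    """
    slack = {p: D - t for p, (D, t) in waiting.items()}
    return sorted(
        p for p in slack
        if not any(slack[q] < slack[p] for q in slack)
    )
```

For instance, with `{"P1": (10, 4), "P2": (3, 1)}` only `P2` remains, since its remaining time to deadline is 2 against 6 for `P1`. When two processes have equal slack, neither dominates the other and both remain allowed, which reflects that the rule only defines a partial order.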



3.2 Priorities as Restriction 

We show that applying a priority rule amounts to restricting by a particular constraint. To obtain this result, we construct from (TS,pr) a timed system TS' that is strongly equivalent to TS, and a constraint $K_{pr}$ such that (TS,pr) is strongly equivalent to $TS'/K_{pr}$. Strong equivalence means that for any state of TS there exists a state of TS' such that the transition graphs are strongly bisimilar from these states, and conversely. The construction has only a theoretical interest and is used to show that $K_{pr}$ is a composable invariant.

Let (TS,pr) be a timed system with priorities. In order to interpret priorities 
on TS as a constraint, we have to identify the states reached right after firing a 
restricted transition. 

For this we transform TS = (S, A, T, X, b, h) into a strongly equivalent timed system TS' = (S', A, T', X, b', h') with $S' \subseteq S \cup (S \times A)$, by iterative application of a state splitting procedure which creates for each transition a unique target control state.

Fig. 4: The splitting procedure.

For each state $s \in S$ with an incident transition of the form $t = (ss, a_j, s)$ where $ss \in S'$ and $a_j \in A$, the splitting procedure removes $t$ and creates a new transition $t' = (ss, a_j, (s, a_j))$. $t'$ is labeled as $t$ with in addition a reset of a new timer $z_j$. Notice that in TS' the set of states reached right after the execution of $a_j$ is characterized by $((s, a_j) \wedge z_j = 0)$. For all states $s \in S'$, we take $b'_s[z_j] = 1$.








Proposition 3.1. Let (TS,pr) be a timed system with priorities, and TS' be the result of the splitting procedure on TS. The constraint

$$K_{pr} = \bigwedge_{s} \bigwedge_{a_j} \bigwedge_{(C,\prec) \in pr} \bigwedge_{a_j \prec_k a_i} \Big( (s, a_j) \wedge z_j = 0 \Rightarrow (\neg \Diamond_k g_i \vee \neg C) \Big),$$

is a composable invariant of TS', and $(TS',pr) = TS'/K_{pr}$, where for a given $s$, $\{(s, a_i, g_i, \tau_i, r_i, s_i)\}_{i \in I}$ is the set of transitions departing from $s$.

Proof. Notice that $K_{pr}$ contains all the states but the ones that would be reached by firing a transition violating the priority rule.

$(TS',pr) = TS'/K_{pr}$ is obtained immediately by comparing syntactically the result of restriction by $K_{pr}$ with the application of the priority rule pr.

To prove composability, we show that $TS'^u \models inv(K_{pr})$. Let $(s,x)$ be a state of $TS'^u$ such that $K_{pr}(s,x)$. (2) If there exists an uncontrollable edge $((s,x), a_u, (s',x'))$ in $TS'^u$, then $K_{pr}$ cannot contain a constraint of the form $s' \wedge z = 0 \Rightarrow \neg C \vee \neg \Diamond_k g$, since $a_u$ is the only transition leading to $s'$ in $TS'^u$. Thus, $K_{pr}(s',x')$. (3) If time can progress by $t > 0$ from $(s,x)$ in $TS'^u$, then $K_{pr}(s, x + t b_s)$ obviously holds.

Corollary 3.1. Let (TS,pr) be a timed system with priorities, K be a control invariant of (TS,pr), ((TS,pr)/K)' be the result of the splitting procedure on (TS,pr)/K, and $K_{pr}$ the constraint associated to pr. Then $((TS,pr)/K)' \models inv(K_{pr})$.

These results say that applying a priority rule can be seen as a restriction of a strongly equivalent timed system by a control invariant. Furthermore, whenever some other control invariant K is applied to (TS,pr), then (TS,pr)/K still satisfies the priority rule pr. In some cases, proposition 3.1 holds without applying the splitting procedure, as shown in the following examples.

3.3 Basic Scheduling Algorithms 

Example 3.2. (The fifo Policy). A scheduler follows a first in first out policy (fifo) if the CPU is granted to the process that has been waiting for the longest time. For non-preemptible processes, fifo is specified by using priorities as follows

$$pr_{fifo} = \{ (t_j < t_i, \; \{ b_j \prec_0 b_i \}) \}_{i \ne j}.$$

This means that whenever two processes $P_i$ and $P_j$ are both waiting for the CPU, $b_i$ has priority over $b_j$ if process $P_i$ has been waiting for a longer time than process $P_j$, i.e. $t_j < t_i$.

Proposition 3.2. $(TS, pr_{fifo}) = TS/K_{fifo}$, where

$$K_{fifo} = \bigwedge_{i \ne j} \big( w_i \wedge e_j \wedge x_j = 0 \Rightarrow t_i \le t_j \big)$$

is the constraint associated with $pr_{fifo}$. Moreover, $K_{fifo}$ is a composable control invariant for TS. (Proof omitted.)







Example 3.3. (The edf Policy). We showed in example 3.1 how to model partially the edf policy on TS as a priority rule, $pr'_{edf}$. But this specification has to be completed: if a process $P_i$ arrives (transition $a_i$) exactly when the decision to allot the CPU to another process is made, the decision might be wrong depending on whether $P_i$ was taken into account or not. This confusion situation can be prevented by a priority rule ensuring that the set of waiting processes is up to date before any decision is made. Therefore, process arrival actions $a_i$ are given priority over $b_j$ actions:

$$pr_{edf} = pr'_{edf} \cup \{ (t_i = T_i, \; \{ b_j \prec_0 a_i \}) \}_{i \ne j}.$$

Let $K_{edf}$ be the constraint associated with $pr_{edf}$. Thus,

$$K_{edf} = \bigwedge_{i \ne j} \big( w_i \wedge e_j \wedge x_j = 0 \Rightarrow D_j - t_j \le D_i - t_i \big) \wedge \big( s_i \wedge e_j \wedge x_j = 0 \Rightarrow t_i \ne T_i \big).$$

Proposition 3.3. $(TS, pr_{edf}) = TS/K_{edf}$, and $K_{edf}$ is a composable control invariant. (Proof omitted.)



Preemptive Fixed-priority Scheduling. Preemptive fixed-priority scheduling assigns the CPU according to some fixed priority order between the processes to be scheduled. If the CPU is free, the highest priority process among the waiting processes is scheduled. An arriving process can preempt a running process of lower priority.

Fig. 6 shows the model of a preemptible process. It has an additional control state p (preempted), and two more transitions: pr (preempt) and rs (resume). The timer $x$ is stopped in control state p, i.e. $b_p[x] = 0$. Everywhere else, timers progress. The timer $x_{pr}$ measures the time elapsed since the process has been preempted.

Consider the timed system of n processes $P_1, \ldots, P_n$ as shown in fig. 6 with the given fixed priorities $\pi_1, \ldots, \pi_n$, where $\pi_i < \pi_j$ means that $P_j$ has priority over $P_i$. As before, mutual exclusion is achieved by application of $K_{mutex}$. We construct the scheduled system of these processes according to the preemptive policy with the priorities $\pi_1, \ldots, \pi_n$ as follows.

Process Priorities. Priorities between the processes are specified by the priority rule on the CPU allocating actions $b$ and $rs$:

$$pr_\pi = \{ (true, \; \{ b_i \prec_0 b_j, \; b_i \prec_0 rs_j, \; rs_i \prec_0 b_j, \; rs_i \prec_0 rs_j \}),$$
$$\qquad\;\; (t_j = T_j, \; \{ b_i \prec_0 a_j, \; rs_i \prec_0 a_j \}) \}_{\pi_i < \pi_j}.$$








The first line says that the CPU is granted, by an action $b_j$ or $rs_j$, to a process $P_j$ that has highest priority among the waiting processes. Here, the constraint that specifies when the priority order applies is true, since the priorities are fixed and do not depend on timer valuations. The second line guarantees that the set of waiting processes is up to date before a new process is scheduled.

It is easy to show that $pr_\pi$ satisfies the definition of a priority rule.

Proposition 3.4. Let $K_{pr_\pi}$ be the constraint associated to $pr_\pi$. Then, $(TS, pr_\pi) = TS/K_{pr_\pi}$, and $K_{pr_\pi}$ is a composable control invariant of TS. (Proof omitted.)

Preemption. $pr_\pi$ only specifies the CPU allocation policy, but not the mechanism preempting a running process, which will be enforced by a further constraint

$$K_{pmtn} = \bigwedge_i \big( p_i \wedge x_{pr_i} = 0 \Rightarrow \exists j \,.\, \pi_j > \pi_i \wedge t_j = 0 \big).$$

Notice that for given process priorities $\pi_1, \ldots, \pi_n$, the term $\exists j \,.\, \pi_j > \pi_i \wedge t_j = 0$ is an $X$-constraint. The constraint means that a process $P_i$ must not take the $pr_i$ action unless there is a higher priority process $P_j$ that has just arrived. It implies that a running process is preempted as soon as a process of higher priority arrives. Immediately after that, since the $a_i$ actions are eager, the CPU is assigned to a waiting process according to $pr_\pi$.

$K_{pmtn}$ is a control invariant for TS, thus from corollary 2.1, $K_{pmtn} \wedge K_\pi$ is also a control invariant of TS. But $K_{pmtn}$ is not composable, and neither is $K_{pmtn} \wedge K_\pi$.
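The condition under which a preemption is admitted can be written as a small predicate. This is a minimal sketch under the reading given above: process $P_i$ may take its preempt action only if some process of strictly higher priority has just arrived, i.e. has arrival timer $t_j = 0$. The name `may_preempt` and the dictionary encoding are assumptions for illustration.

```python
def may_preempt(i, prio, t):
    """Return True iff process i may be preempted right now.

    prio -- dict: process name -> fixed priority (larger means higher)
    t    -- dict: process name -> time elapsed since its last arrival
    """
    return any(prio[j] > prio[i] and t[j] == 0 for j in prio if j != i)
```

With `prio = {"P1": 1, "P2": 2}`, process `P1` can be preempted at the instant `P2` arrives (`t["P2"] == 0`), while `P2` can never be preempted by `P1`.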

Example 3.4. (The rms Policy with Preemption). The algorithm of preemptive rate-monotonic scheduling (rms, [7]) assigns to each process a fixed priority such that processes with shorter period have higher priority, i.e., $T_i > T_j \Rightarrow \pi_i < \pi_j$.

The invariant $K_\pi$ can be obtained from $pr_\pi$ as before. As remarked above, $K_{pmtn} \wedge K_\pi$ is not composable. However, the rms policy makes the scheduled system $(TS, pr_\pi)/K_{pmtn}$ nearly deterministic since $\pi$ defines a total order. Therefore, there is no need to further restrict the system: it is either schedulable or not.
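A rate-monotonic priority assignment satisfying $T_i > T_j \Rightarrow \pi_i < \pi_j$ can be sketched by ranking the processes by decreasing period. The function name `rms_priorities` is illustrative; how ties between equal periods are broken is an arbitrary choice of this sketch, which the implication above leaves open.

```python
def rms_priorities(periods):
    """Map each process to a priority so that shorter periods get larger values.

    periods -- dict: process name -> period T
    Returns a dict: process name -> priority in 1..n (n is the highest).
    """
    # Longest period first, so it receives the lowest priority value.
    order = sorted(periods, key=lambda p: periods[p], reverse=True)
    return {p: rank for rank, p in enumerate(order, start=1)}
```

For the two processes of example 2.3, assuming the periods 15 and 5 suggested by the timer bounds in example 2.4, the process with period 5 receives the higher priority.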

4 Conclusion 

This work aims at bridging the gap between scheduling theory and timed systems 
specification and analysis. From the general idea that a scheduler is a controller 
of the scheduled processes, we elaborate a methodology for the construction of 
a scheduled system. The methodology is illustrated on periodic processes but it 
can be applied to arbitrary systems of structurally timelock-free processes. 

A contribution of this work is the decomposition of scheduling requirements into classes of requirements that can be expressed as safety constraints. We believe that the decomposition allows better understanding of scheduling problems and clarification of the differences between the two approaches. Scheduling theory studies sufficient conditions guaranteeing $K_{sched}$ for particular scheduling algorithms characterized by some $K_{algo}$. On the contrary, timed systems specification







and analysis have focused so far on the extraction of behaviors satisfying $K_{sched}$ from a global model.

This work relates controller synthesis by means of the notion of control in- 
variant, to a methodology for constructing a scheduled system satisfying given 
requirements. The existence of composable control invariants allows the auto- 
matic application of the corresponding synthesis steps. Not surprisingly, finding 
control invariants for schedulability is the hard problem that deserves further 
investigation. Possible directions are the development of specific synthesis algo- 
rithms or the use of constructive correctness techniques as in [2] . 

This work is developed in the framework of a project on real-time systems 
modeling and validation. We have applied the methodology to the description 
of the ceiling protocol [12] and are currently developing tools supporting the 
methodology. 

References 

1. K. Altisen, G. Gößler, A. Pnueli, J. Sifakis, S. Tripakis, and S. Yovine. A framework for scheduler synthesis. In IEEE RTSS 1999 proceedings, 1999.

2. S. Bornot, G. Gößler, and J. Sifakis. On the construction of live timed systems. In TACAS 2000, volume 1785 of LNCS. Springer-Verlag, 2000.

3. S. Bornot and J. Sifakis. On the composition of hybrid systems. In International NATO School on “Verification of Digital and Hybrid Systems”, LNCS. Springer-Verlag, 1997.

4. P.A. Hsiung, F. Wang, and Y.S. Kuo. Scheduling system verification. In TACAS’99, volume 1597 of LNCS. Springer-Verlag, 1999.

5. Y. Kesten, A. Pnueli, J. Sifakis, and S. Yovine. Integration graphs: A class of 
decidable hybrid systems. Information and Computation, 736, 1992. 

6. H.-H. Kwak, I. Lee, A. Philippou, J.-Y. Choi, and O. Sokolsky. Symbolic schedu- 
lability analysis of real-time systems. In IEEE RTSS 1998 proceedings, 1998. 

7. C.L. Liu and J.W. Layland. Scheduling algorithms for multiprogramming in a 
hard-real-time environment. Journal of the ACM, 20(1), 1973. 

8. Z. Liu and M. Joseph. Specification and verification of fault-tolerance, timing, and 
scheduling. ACM Transactions on Programming Languages and Systems, 21(1):46- 
89, 1999. 

9. O. Maler, A. Pnueli, and J. Sifakis. On the synthesis of discrete controllers for 
timed systems. In STACS’95, volume 900 of LNCS. Springer Verlag, 1995. 

10. P. Niebert and S. Yovine. Computing optimal operation schemes for chemical plants 
in multi-batch mode. In Hybrid Systems, Computation and Control, volume 1790 
of LNCS. Springer Verlag, 2000. 

11. P.J. Ramadge and W.M. Wonham. Supervisory control of a class of discrete event 
systems. Journal of Control and Optimization, 25(1), 1987. 

12. L. Sha, R. Rajkumar, and J. P. Lehoczky. Priority inheritance protocols: An ap- 
proach to real-time synchronization. IEEE Transactions on Computers, 39(9), 1990. 

13. S. Vestal. Modeling and verification of real-time software using extended linear 
hybrid automata. In Fifth NASA Langley Formal Methods Workshop, 2000. 




A Dual Interpretation of “Standard Constraints” in 
Parametric Scheduling 



K. Subramani¹ and Ashok Agrawala²

¹ Department of Computer Science and Electrical Engineering,
West Virginia University,
Morgantown, WV, USA
ksmani@csee.wvu.edu

² Department of Computer Science,
University of Maryland,
College Park, MD, USA
agrawala@cs.umd.edu



Abstract. Parametric scheduling in real-time systems, in the presence of linear 
relative constraints between the start and execution times of tasks, is a well-studied 
problem. Prior research established the existence of polynomial time algorithms 
for the case when the constraints are restricted to be standard and the execution 
time vectors belong to an axis-parallel hyper-rectangle. In this paper we present 
a polynomial time algorithm for the case when the execution time vectors belong 
to arbitrary convex domains. Our insights into the problem occur primarily as a 
result of studying the dual polytope of the constraint system. 



1 Introduction 

The problem of parametric scheduling for hard real-time systems was introduced in [Sak94]. In particular, they considered the scheduling of processes subject to linear relative constraints between the start and execution times of tasks. In [GPS95], a polynomial time algorithm was presented for the case where the constraints are “standard” (defined in Section §6). In this paper, we present a polynomial time algorithm for parametric scheduling when the execution time vectors belong to arbitrary convex domains. Our insights into the problem occur primarily as a result of studying the dual polytope of the constraint system.

The rest of this paper is organized as follows: In Section §2, we present the parame- 
tric scheduling model and pose the parametric schedulability query. In the succeeding 
section, viz. Section §3, we discuss the motivation behind the problem and related appro- 
aches in the literature. Section §4 commences our analysis by looking at the complement 
of the parametric scheduling problem. In Section §5, we study the dual of the complement 
problem and apply Farkas’ lemma to derive the termination condition for our algorithm. 
Section §6 presents the “Standard Constraints Model”. We also discuss the structure of
the standard constraint matrix and interpret the complement of the parametric sche- 
dulability query in this model. We show that the infeasibility of the input constraint 
system coincides with the existence of a loop having infinite negative cost in a certain 
weighted graph. This implies that a symbolic version of the Bellman-Ford Algorithm for 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 121-133, 2000. 

© Springer-Verlag Berlin Heidelberg 2000







the Single-Source-Shortest-Paths problem ( SSSP ) in a network can be used to solve 
the parametric scheduling problem. Section §7 provides such an algorithm, while § 7.1 
discusses its correctness and complexity. We conclude in Section §8 by summarizing 
our results and posing problems for further research. 

2 The Parametric Model 

We are given a set of ordered non-preemptive tasks $J = \{J_1, J_2, \ldots, J_n\}$, with linear constraints imposed on their respective start times $\{s_1, s_2, \ldots, s_n\}$ and execution times $\{e_1, e_2, \ldots, e_n\}$. The constraint system is expressed in matrix form as:

$$A.[s, e] \le b, \qquad (1)$$

where,

- $s = [s_1, s_2, \ldots, s_n]$ is an n-vector of the start times of the tasks,

- $e = [e_1, e_2, \ldots, e_n]$ is an n-vector of the execution times of the tasks,

- $A$ is an $m \times 2.n$ matrix of rational numbers,

- $b = [b_1, b_2, \ldots, b_m]$ is an m-vector of rational numbers.

System (1) is a convex polyhedron in the $2.n$ dimensional space, spanned by the start time axes $\{s_1, s_2, \ldots, s_n\}$ and the execution time axes $\{e_1, e_2, \ldots, e_n\}$. The execution time of the task $J_i$ is not constant, but belongs to the set $E_i$ where $E_i$ is the projection of a convex set $E$ on axis $e_i$. The execution times are independent of the start times of the tasks; however they may have complex interdependencies among themselves. This interdependence is captured by the set $E$. We regard the execution times as n-vectors belonging to the set $E$.

The goal is to obtain a start time vector $s$ that satisfies the constraint system (1) for all execution time vectors belonging to the set $E$. One way of approaching this problem is through Static Scheduling techniques, as discussed in [SA00b]. However, Static Scheduling results in the phenomenon known as loss of schedulability discussed below.

Consider the two task system $J = \{J_1, J_2\}$ with start times $\{s_1, s_2\}$, execution times $\{e_1 \in [2, 4], e_2 \in [4, 5]\}$ and the following set of constraints:

- Task $J_1$ must finish before task $J_2$ commences; i.e. $s_1 + e_1 \le s_2$;

- Task $J_2$ must commence within 1 unit of $J_1$ finishing; i.e. $s_2 \le s_1 + e_1 + 1$;

A static approach forces the following two constraints:

- $s_1 + 4 \le s_2$,

- $s_2 \le s_1 + 2 + 1 \Rightarrow s_2 \le s_1 + 3$



Clearly the resultant system is inconsistent and there is no static solution. Now consider the following start time vector assignment.

$$\begin{bmatrix} s_1 \\ s_2 \end{bmatrix} = \begin{bmatrix} 0 \\ s_1 + e_1 \end{bmatrix} \qquad (2)$$







This assignment clearly satisfies the input set of constraints and is hence a valid solution. The key feature of the solution provided by (2) is that the start time of task $J_2$ is no longer an absolute time, but a (parameterized) function of the start and execution times of task $J_1$. This phenomenon in which a static scheduler declares a system infeasible in the presence of a valid solution (albeit parameterized) is termed loss of schedulability.
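The two-task example can be checked mechanically. The sketch below samples the execution time $e_1$ on a grid over $[2, 4]$ (an assumption of the example; because both constraints are linear in $e_1$, the endpoints alone already decide the static case) and compares a fixed pair of start times with the parametric assignment $s_2 = s_1 + e_1$. All function names are illustrative.

```python
from fractions import Fraction

def ok(s1, s2, e1):
    # The two relative constraints of the example.
    return s1 + e1 <= s2 and s2 <= s1 + e1 + 1

# Sample points 2, 2.25, ..., 4 for the execution time of J1.
E1 = [Fraction(2) + Fraction(i, 4) for i in range(9)]

def static_ok(s1, s2):
    # A static schedule fixes s1 and s2 before e1 is known.
    return all(ok(s1, s2, e1) for e1 in E1)

def parametric_ok(s1):
    # The parametric schedule chooses s2 = s1 + e1 once J1 has finished.
    return all(ok(s1, s1 + e1, e1) for e1 in E1)

# Search a grid of candidate gaps s2 - s1 for a static solution.
static_exists = any(static_ok(0, Fraction(d, 4)) for d in range(41))
```

Here `static_exists` is false, because $e_1 = 4$ requires $s_2 - s_1 \ge 4$ while $e_1 = 2$ requires $s_2 - s_1 \le 3$, whereas `parametric_ok(0)` is true.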

In the parametric scheduling model, we are interested in checking whether an input 
constraint system has a parametric schedule, i.e. a schedule in which the start time of a 
task can depend on the start and execution times of tasks that are sequenced before it. 

Definition 1. A parametric solution of an ordered set of tasks, subject to a set of linear relative constraints (expressed by (1)) is a vector $s = [s_1, s_2, \ldots, s_n]$, where $s_1$ is a rational number and each $s_i$, $i \ne 1$, is a function of the variables $\{s_1, e_1, s_2, e_2, \ldots, s_{i-1}, e_{i-1}\}$. Further, this vector should satisfy the constraint system (1) for all vectors $e \in E$.

Based on the discussion above, we are in a position to state the parametric schedulability query:

$$\exists s_1 \forall e_1 \in E_1 \; \exists s_2 \forall e_2 \in E_2 \; \ldots \; \exists s_n \forall e_n \in E_n \quad A.[s, e] \le b \;? \qquad (3)$$

The elimination strategies used in [GPS95] establish that a parametric schedule need only have linear functions.

3 Motivation and Related Work 

Our investigations have been motivated by two orthogonal concerns viz. real-time operating systems and real-time applications.

In real-time operating systems such as Maruti [LTCA89,MAT90,MKAT92] and MARS [DRSK89], the interaction of processes is constrained through linear relationships between their start and execution times. Real-time specification languages such as the Maruti Programming Language (MPL) [SdSA94] permit programmer constructs such as:

- within 10 ms; do
  Perform Task 1 od

- Perform Task 1;
  Delay at most 17 ms;
  Perform Task 2

These constructs are easily transformed into linear constraints between the start and execution times of the tasks. For instance, the first construct can be expressed as: $s_1 \ge 10$, while the second construct is captured through: $s_2 \ge f_1 + 17$. Note that $f_1$ is the finish time of task 1 and since we are dealing with non-preemptive tasks, we can write $f_i = s_i + e_i$, where $f_i$ denotes the finish time of task $i$.

The automation of machining operations [Y.K80,Kor83,SE87,SK90] provides a rich 
source of problems in which execution time vectors belong to convex domains. Con- 
sider the contouring system described in [TSYT97], in which the task is to machine a 
workpiece through cutting axes. In general, there are multiple axes of motion that move 







with different velocities. In a two axis system, a typical requirement would be to constrain the sum of the velocities of the axes to exceed a certain quantity. This is captured through: $e_1 + e_2 \ge a$.

Real-time database applications involve the scheduling of transactions and the exe- 
cution of these transactions is constrained through linear relationships [BFW97]. 

Deterministic sequencing and scheduling have been studied extensively in the literature [BS74,DL78,Cof76]. Our focus is on a particular scheduling model viz. the parametric scheduling model proposed in [Sak94]. In [GPS95] a polynomial time algorithm is presented for the standard constraints case, in which the execution time vectors belong to an axis-parallel hyper-rectangle. They use the Fourier-Motzkin (FM) elimination method [Sch87] to successively eliminate the variables in query (3). The FM algorithm takes exponential time in the worst case; in the case where the constraints are standard they show that they can prevent the exponential increase in the number of constraints. Hochbaum et al. [HN94] have shown that it is possible to implement FM elimination in strongly polynomial time for network constraints. In a previous paper [SA00a], we showed that a restricted version of the parametric scheduling problem is NP-complete, when the constraints are arbitrary. It was also established that it is sufficient to determine whether a system is parametrically schedulable; explicit construction of the parametric functions is not necessary.

In this paper, we extend the results in [GPS95] to provide polynomial time algorithms for standard constraints in arbitrary convex domains.
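To make the Fourier-Motzkin step concrete, the following is a minimal textbook-style sketch of eliminating one variable from a system of inequalities of the form $a \cdot x \le b$, and of deciding feasibility by eliminating all variables. It is not the specialised elimination procedure of [GPS95], it handles existentially quantified variables only, and the names `fm_eliminate` and `feasible` are hypothetical.

```python
def fm_eliminate(constraints, j):
    """Eliminate variable j from a list of (coeffs, bound) pairs,
    each meaning sum(coeffs[i] * x[i]) <= bound."""
    pos, neg, rest = [], [], []
    for a, b in constraints:
        if a[j] > 0:
            pos.append((a, b))
        elif a[j] < 0:
            neg.append((a, b))
        else:
            rest.append((a, b))
    # Combine every upper bound on x_j with every lower bound on x_j.
    for ap, bp in pos:
        for an, bn in neg:
            cp, cn = -an[j], ap[j]          # both multipliers are positive
            coeffs = [cp * u + cn * v for u, v in zip(ap, an)]
            rest.append((coeffs, cp * bp + cn * bn))
    return rest

def feasible(constraints, n):
    """Decide whether the system over n variables has a solution."""
    for j in range(n):
        constraints = fm_eliminate(constraints, j)
    # Only constraints of the form 0 <= bound remain.
    return all(b >= 0 for _, b in constraints)
```

Applied to the static constraints of the two-task example, $s_1 - s_2 \le -4$ and $s_2 - s_1 \le 3$, the elimination of $s_1$ produces $0 \le -1$, so `feasible` reports infeasibility. The number of generated constraints can grow as the product of the upper and lower bound counts at every step, which is the source of the exponential worst case mentioned above.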

4 Complement of Parametric Scheduling 

We commence our analysis by looking at the complement of the parametric scheduling query (3). Observe that a query is true iff its complement is false.

The complement of query (3) is:

$$\neg ( \exists s_1 \forall e_1 \in E_1 \; \exists s_2 \forall e_2 \in E_2 \; \ldots \; \exists s_n \forall e_n \in E_n \quad A[s, e] \le b \;? ), \qquad (4)$$

which gives

$$\forall s_1 \exists e_1 \in E_1 \; \forall s_2 \exists e_2 \in E_2 \; \ldots \; \forall s_n \exists e_n \in E_n \quad A[s, e] \not\le b \;?$$

where $A.x \not\le b$ means that the polyhedral set $\{x : A.x \le b\}$ is empty.

As observed in [SA00a], when we restrict ourselves to the case in which the execution times are independent of the start times of the tasks, we can restate the query above as:

$$\exists e_1 \in E_1 \; \exists e_2 \in E_2 \; \ldots \; \exists e_n \in E_n \; \forall s_1 \forall s_2 \ldots \forall s_n \quad A[s, e] \not\le b \;? \qquad (5)$$

which implies

$$\exists e = [e_1, e_2, \ldots, e_n] \; \forall s_1 \forall s_2 \ldots \forall s_n \quad A[s, e] \not\le b \;? \qquad (6)$$

Query (6) basically asks whether there exists an execution time vector $e = [e'_1, e'_2, \ldots, e'_n] \in E$ such that the linear system resulting from substituting these execution times in $A.[s, e] \le b$ is infeasible, i.e. as shown in [SA00a], (6) asks whether the polyhedral set






$$\{ s : A.[s, e] \le b \mid e = [e'_1, e'_2, \ldots, e'_n] \} \qquad (7)$$

is empty.

For the rest of the paper, we focus on finding such a witness execution time vector; 
if we succeed in finding such a vector, it means that the input system does not have a 
parametric schedule. On the other hand, if we can definitely say that no such execu- 
tion vector exists within the convex domain E, then query (3) can be answered in the 
affirmative and the input system does have a parametric schedule. 

5 The Parametric Dual 

We rewrite the constraint system (1) in the form:

$$G.s \le b - B.e \qquad (8)$$

where

$$A.[s, e] = G.s + B.e$$

Accordingly, query (6) gives

$$\exists e = [e_1, e_2, \ldots, e_n] \; \forall s_1 \forall s_2 \ldots \forall s_n \quad G.s \not\le b - B.e \;? \qquad (9)$$

Note that $(b - B.e)$ is an m-vector, with each element being an affine function in the $e_i$ variables. We set $g = (b - B.e)$, so that we can rewrite query (9) as

$$\exists e = [e_1, e_2, \ldots, e_n] \; \forall s_1 \forall s_2 \ldots \forall s_n \quad G.s \not\le g \;? \qquad (10)$$

The matrix $G$ will henceforth be referred to as the constraint matrix. Note that $G$ is an $m \times n$ rational matrix.

In order to find an execution time vector, which serves as a witness to the infeasibility 
of the input constraint system, we study the dual of the complement problem. The 
following lemma called Farkas’ lemma [NW88,Sch87] is crucial to understanding and 
analyzing the dual. 

Lemma 1. Either {x € 5ft" : A.x < b} ^ or ( exclusively ) 3y G 5ft™, such that, 
y^A > 0 and y^.b = — c». 



Proof : See [Sch87,PS82,NW88]. □ 

The lemma is interpreted as follows: Either the primal system, viz. $\{A.x \le b, x \ge 0\}$,
is feasible, in which case the associated polyhedron is non-empty, or (exclusively)
the vector $b$ lies in the polar cone of the dual space, viz. $\{y^T.A \ge 0, y \ge 0\}$ (see Figure 1).

In the latter case, the function $y^T.b$ is unbounded below and its minimum is $-\infty$.
For a geometric interpretation of the lemma, refer to [PS82].
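
The exclusive alternative in the lemma can be checked numerically on a tiny instance. The following is a minimal sketch, not part of the original paper: the function name `is_farkas_certificate` and the example system are hypothetical. It tests whether a given $y \ge 0$ satisfies $y^T.A \ge 0$ and $y^T.b < 0$; scaling such a $y$ by an arbitrarily large factor is what drives $y^T.b$ to $-\infty$.

```python
def is_farkas_certificate(A, b, y):
    """Return True if y >= 0, y^T A >= 0 and y^T b < 0.

    Such a y proves that {x >= 0 : A x <= b} is empty, because any
    feasible x would give 0 <= y^T A x <= y^T b < 0.
    (Hypothetical helper, for illustration only.)
    """
    m, n = len(A), len(A[0])
    if len(y) != m or any(yi < 0 for yi in y):
        return False
    yTA = [sum(y[i] * A[i][j] for i in range(m)) for j in range(n)]
    yTb = sum(y[i] * b[i] for i in range(m))
    return all(c >= 0 for c in yTA) and yTb < 0


# Illustrative system: x1 + x2 <= 1 together with -x1 - x2 <= -3 is infeasible.
A = [[1, 1], [-1, -1]]
b = [1, -3]
y = [1, 1]  # y^T A = [0, 0], y^T b = -2

certified = is_farkas_certificate(A, b, y)
```

Multiplying `y` by any $t > 0$ keeps the first two conditions and gives $y^T.b = -2t$, which is unbounded below as $t$ grows.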

Query (10) requires the system $G.s \le g$ to be infeasible for a particular $e \in E$.
Farkas' lemma assures us that this is possible only if $\exists y \in \Re^m_+$ such that

$$y^T.G \ge 0, \quad y^T.(b - B.e) = -\infty \qquad (11)$$

which implies that

$$G^T.y \ge 0, \quad y^T.(b - B.e) = -\infty. \qquad (12)$$

Fig. 1. Farkas' Lemma

Equation (12) is interpreted algorithmically in the following way:

Let $z$ be the minimum of the bilinear form $y^T.(b - B.e)$ over the two convex bodies
$\{y : y \ge 0, G^T.y \ge 0\}$ and $E$. If $z = -\infty$, the input system of constraints does
not have a parametric schedule.



6 “Standard Constraints” Model 

As discussed in [GPS95,Sch87] the Fourier-Motzkin elimination method suffers from 
the curse of dimensionality i.e. it is an exponential time method in the worst-case. 
[Sak94] shows that for an important subset of constraints, viz. Standard Constraints, 
the elimination method runs in polynomial time. As described in [Sak94], 

Definition 2. A standard constraint involves the start times of at most two tasks $J_i$ and
$J_j$, such that exactly one of $s_i$ or $s_j$ appears on one side of the $\le$ relation. Further, the
coefficients of all start and execution variables are unity.

For example, the following constraints are standard:

1. $s_i \le s_j + 2$,

2. $s_i + e_i \le s_j$,

3. $s_i + e_i \le s_j + e_j + 2$.

The constraint $s_i + s_j \le 2$ is not standard, because both $s_i$ and $s_j$ appear on the same side
of the relational operator $\le$. Absolute constraints, i.e., constraints in which the start time of
a task is constrained by an absolute value (e.g. $s_i \ge 5$), are also permitted and considered
standard. In order to make the treatment of constraints (absolute and relative) uniform,
we introduce an additional task $J_0$ with start time $s_0$ and execution time $e_0 = 0$. Further,
we impose the constraint $s_0 + e_0 \le s_1$. Absolute constraints are modified as follows:

- Constraints of the form $s_i \le a$ are replaced by $s_i - s_0 \le a$;

- Constraints of the form $s_i \ge a$ are replaced by $s_0 - s_i \le -a$.

For the rest of the discussion, we assume that the matrix $G$ in (12) has been altered
to reflect the above changes. Accordingly, $G^T$ has $n + 1$ rows (0 through $n$) and $m + 1$
columns ($m$ for the initial constraints and one for the additional constraint between $s_0$ and $s_1$).



6.1 Structure of the Transpose of the Standard Constraint Matrix 

When the constraints are standard, the transpose of the constraint matrix (i.e., $G^T$ in
(12)) has the following structure:

1. All entries belong to the set $\{0, 1, -1\}$.

2. There are exactly 2 non-zero entries in any column; one of the entries is 1 and the
other entry is $-1$.

In this case, we can show that the problem: Is $z = y^T.b = -\infty$?, subject to:

$$G^T.y = g, \quad y \ge 0 \qquad (13)$$

where

- $G^T$ is an $(n + 1) \times (m + 1)$ rational matrix, with the structure discussed above,

- $y$ is an $(m + 1)$-vector,

- $b$ is a rational $(m + 1)$-vector; $b_0 = 0$,

- $g$ is a rational $(n + 1)$-vector,

has a min-cost flow interpretation in a constraint network.* The network $G' = \langle V', E' \rangle$
corresponding to the constraint system $G^T.y = g$ is constructed as follows:

1. A vertex $v_j$ for each row $j$, $j = 0, \ldots, n$ of $G^T$, giving a total of $n + 1$ vertices.
Note that $v_i$ corresponds to row $G_i$, i.e., task $J_i$.

2. Associated with each vertex $v_j$ is a supply equal to $g_j$; set $g_0 = 0$.

3. An edge $e_i$ for each column $i$, $i = 0, \ldots, m$ of $G^T$, giving a total of $m + 1$
edges. Let $v_a$ denote the vertex corresponding to the $+1$ entry and $v_b$ the vertex
corresponding to the $-1$ entry. Direct the edge from $v_a$ to $v_b$.

4. Associated with edge $e_i$ is cost $b_i$, where $b_i$ is the coefficient of $y_i$.

5. $y_i\ (\ge 0)$, $\forall i = 0, \ldots, m$ represents the flow on edge $e_i$. The flow on the edge $e_0$,
i.e., $y_0$, does not contribute to the total cost, as the cost on this edge is 0.

* These constraints are a subset of monotone constraints, in which only relations of the form
$a.x_1 - b.x_2 \le c$, $a, b > 0$ are allowed.

The vertex $v_0$ is the source of this network. Each constraint is now a mass balance
condition, i.e., it states that the net flow into node $v_i$, which is the difference between the
total flow into $v_i$ and the total flow out of $v_i$, must equal the supply at $v_i$. $z = y^T.b$
represents the cost of the flow.
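
The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`build_network`, `node_balance`, `flow_cost`) and the three-vertex matrix are hypothetical, and $G^T$ is assumed to already have exactly one $+1$ and one $-1$ per column.

```python
def build_network(GT):
    """One edge per column of G^T, directed from the +1 row to the -1 row."""
    rows, cols = len(GT), len(GT[0])
    edges = []
    for i in range(cols):
        column = [GT[j][i] for j in range(rows)]
        edges.append((column.index(1), column.index(-1)))  # (v_a, v_b)
    return edges


def node_balance(GT, y):
    """Row j of G^T y: flow on edges leaving v_j minus flow on edges entering v_j."""
    return [sum(GT[j][i] * y[i] for i in range(len(y))) for j in range(len(GT))]


def flow_cost(b, y):
    """z = y^T b, the cost of the flow y."""
    return sum(bi * yi for bi, yi in zip(b, y))


# Hypothetical 3-vertex example: columns encode edges 0->1, 1->2, 2->0.
GT = [[1, 0, -1],
      [-1, 1, 0],
      [0, -1, 1]]
b = [0, 3, -5]   # edge costs; b_0 = 0 as in the text
y = [2, 2, 2]    # two units of flow around the loop

edges = build_network(GT)
balance = node_balance(GT, y)   # equals g = 0 for a circulation
cost = flow_cost(b, y)
```

With these values the flow is a circulation (every entry of `balance` is zero) and its cost is twice the cost of the loop.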

Let us analyze two special cases, which directly bear upon our scheduling problem:

1. $g = 0$, i.e., the supply at all vertices is zero.

Lemma 2. In this case the condition $z = -\infty$ is possible iff there is a negative
cost loop in the network.

Proof: Clearly, if there is a negative cost loop, we can pump flow in that loop,
decreasing the cost arbitrarily, while meeting the mass balance constraints.

Now assume that $z = -\infty$. This is possible only if the flow vector $y$ is unbounded
in some of its elements. Let us pick some element $y_k$ that is unbounded, i.e., $y_k = +\infty$.
Thus there is an infinite flow on edge $e_k$. In order to satisfy the zero supply
requirement at the vertices corresponding to its end-points, $e_k$ must belong to a
closed loop and all the edges in that loop have the same flow equal to $+\infty$. Since
$z = -\infty$, it follows that the cost around that loop is negative. □

2. $g_1 = a\ (a \ge 0)$; $g_i = 0$, $i = 2, \ldots, n$. In this case the first node has a supply that
could be non-zero. This is now a Single-Source Shortest Path (SSSP) Problem with vertex
$v_1$ being the source. Using arguments identical to the case above, it is clear that
$z = -\infty$ coincides with the existence of a negative cost cycle, i.e., the shortest path
from the source $v_0$ to any vertex on this cycle is of length $-\infty$.
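
For numeric (rather than symbolic) edge costs, the negative cost cycle test mentioned in case 2 is the classical Bellman-Ford check. The sketch below is a standard textbook version, not the paper's symbolic variant; the function name and the small example graph are assumptions made for illustration.

```python
import math


def has_negative_cycle(num_vertices, edges, source=0):
    """Classical Bellman-Ford: True if a negative cost cycle is reachable from source.

    edges is a list of (u, v, w) triples meaning an edge u -> v of cost w.
    """
    d = [math.inf] * num_vertices
    d[source] = 0
    for _ in range(num_vertices - 1):
        for u, v, w in edges:
            if d[u] + w < d[v]:      # relax edge (u, v)
                d[v] = d[u] + w
    # If any edge can still be relaxed, some shortest path has length -infinity.
    return any(d[u] + w < d[v] for u, v, w in edges)


# Hypothetical graphs: the loop 1 -> 2 -> 1 costs -1 in the first, +1 in the second.
negative = has_negative_cycle(3, [(0, 1, 4), (1, 2, -2), (2, 1, 1)])
non_negative = has_negative_cycle(3, [(0, 1, 4), (1, 2, -2), (2, 1, 3)])
```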

Our dual system (12), though, is in the form $G^T.y \ge 0$, $y \ge 0$. Before we apply
the flow-related concepts and results derived above, the system needs to be converted
into equality form. We use the Complementary Slackness property [Sch87,PS82] to
aid us in this conversion. Observe that in the primal system, the start time variables
are strictly ordered, i.e., we have $s_i \le s_{i+1}$, $i = 0, \ldots, n - 1$. We impose $s_1 \ge \epsilon$ to
simplify the analysis. Thus in any solution (including the optimal), we must have
$s_i > 0$, $\forall i = 1, \ldots, n$. According to the Complementary Slackness property, if the primal
variable is non-zero at optimality, then the corresponding constraint in the dual must be
met with equality. Thus, all the constraints in the system $G^T.y \ge 0$, except the first one,
are met with equality, which is exactly what we need. Hence, we can rewrite condition
(12) for infeasibility in the primal as:

$$\exists y \in \Re^{m+1}_+ \quad G^T.y = [a, 0, \ldots, 0]^T, \quad y^T.(b - B.e) = -\infty. \qquad (14)$$

Thus our problem is equivalent to the SSSP problem as discussed in case (2) of the
above analysis. There is one major difference, viz. in our case the edge costs are symbolic
(e.g. $e_1$, $e_1 - e_2$, etc.) and not rational numbers. However, if we can find values $e' \in E$ for
the $e_i$ variables, then our techniques and results still hold. We now require an algorithm
that detects symbolic negative cost cycles in the constraint graph corresponding to the
given constraint system. We provide such an algorithm in the following section.



7 The Symbolic Bellman-Ford Algorithm 

Algorithm (7.1), together with procedures (7.2) and (7.3), represents the Symbolic
Bellman-Ford Algorithm. The key modification to the algorithm in [CLR92] is the addition
of procedure 7.3. In the case where the edge weights are rational numbers, it is trivial
to check whether $d[v]$ exceeds $d[u] + w(u, v)$.

The input to the algorithm is a graph $G' = \langle V', E' \rangle$, with $V'$ denoting the vertex set
and $E'$ denoting the edge set. The weights on the edges are parameterized linear functions
in the execution times, as discussed above. The function Initialize-Single-Source
sets the source $s$ to be at a distance of 0 from itself and all other vertices at a distance of
$\infty$ from the source. A detailed exposition of the Bellman-Ford Algorithm is presented
in [CLR92]. Let $\delta[v_0, v_i]$ and $d[v_i]$ denote the length of the shortest path from $v_0$ to vertex
$v_i$ and the current estimate of the shortest path, respectively.



Function Symbolic-Bellman-Ford (G', w, s)

 1: Initialize-Single-Source
 2: for ( i ← 1 to |V'(G')| − 1 ) do
 3:    for ( each edge (u, v) ∈ E'[G'] ) do
 4:       Symbolic-Relax(u, v, w)
 5:    end for
 6: end for
 7:
 8: for ( each edge (u, v) ∈ E'[G'] ) do
 9:    if ( d[v] >sym d[u] + w(u, v) ) then
10:       return(false)
11:    end if
12: end for
13:
14: return(true)

Algorithm 7.1: Symbolic Bellman-Ford

Procedure Symbolic-Relax(u, v, w)
   if ( d[v] >sym d[u] + w(u, v) ) then
      d[v] = d[u] + w(u, v)
   end if

Algorithm 7.2: Symbolic-Relax

Function Symbolic >(u, v, w)
   if ( min_E −(d[v] − d[u] − w(u, v)) < 0 ) then
      return(true)
   else
      return(false)
   end if

Algorithm 7.3: Implementation of >sym

Assuming that all vertices are reachable from the source (a valid assumption in our
case), a return value of true means that there is a finite shortest path from the source
to every other vertex. Likewise, the detection of a negative cost cycle (indicating that
certain vertices are at a distance of $-\infty$ from the source) causes the value false to be
returned. When dealing with rational numbers, all the above operations are relatively
straightforward. In our case, the weights on the edges are no longer rational numbers,
but parameterized linear forms in the $e_i$ variables, as indicated above. The algorithm
implementing Symbolic > is a convex minimization algorithm [PS82,HuL93].
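
When $E$ is an axis-parallel hyper-rectangle (as in the example of this section), the minimization behind the symbolic comparison has a closed form: an affine form is minimized coordinate by coordinate. The sketch below assumes that special case only; for general convex $E$ a convex programming routine would be needed. The representation of an affine form as a `(constant, coefficients)` pair and the names `affine_min` and `sym_greater` are hypothetical.

```python
def affine_min(form, box):
    """Minimum of const + sum(c_k * e_k) over a box [(lo_1, hi_1), ...]."""
    const, coeffs = form
    return const + sum(c * (lo if c >= 0 else hi) for c, (lo, hi) in zip(coeffs, box))


def affine_sub(f, g):
    """Affine form f - g."""
    return (f[0] - g[0], [a - b for a, b in zip(f[1], g[1])])


def sym_greater(dv, du_plus_w, box):
    """True if d[v] > d[u] + w(u, v) for at least one e in the box,
    i.e. if the minimum over the box of (d[u] + w(u, v) - d[v]) is negative."""
    return affine_min(affine_sub(du_plus_w, dv), box) < 0


# Box taken from the example: e1 in [4,8], e2 in [6,11], e3 in [10,13], e4 in [3,9].
E = [(4, 8), (6, 11), (10, 13), (3, 9)]
dv = (56, [0, 0, 0, -1])        # the form 56 - e4
du_plus_w = (0, [1, 1, 1, 0])   # the form e1 + e2 + e3

exists_point = sym_greater(dv, du_plus_w, E)   # is 56 - e4 > e1 + e2 + e3 somewhere?
reverse = sym_greater(du_plus_w, dv, E)        # is e1 + e2 + e3 > 56 - e4 somewhere?
```

Infinite distance estimates are not handled here; a fuller sketch would treat an unset $d[v]$ as greater than any finite form.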

7.1 Analysis - Correctness and Complexity 

The correctness of the algorithm follows from the correctness of the Bellman-Ford
algorithm [CLR92]. The following two cases arise:

1. There is no point $e' \in E$ such that substituting $e'$ on the edge costs results in a
negative cost cycle.

Claim 7.1. Algorithm (7.1) returns true.

Proof: Observe that in the absence of a witness vector $e' \in E$, the shortest path
from $v_0$ to every vertex is finite. Using the inductive technique from [CLR92], it is
clear that after $|V'(G')| - 1$ iterations of the for loop in Step 2 of Algorithm (7.1)
the distance of each vertex has converged to its true shortest path from the source.
Consequently the test in the succeeding for loop fails and the value true is returned. □

2. There exists a point $e' = [e'_1, e'_2, \ldots, e'_n] \in E$ such that substituting $e'$ on the edge
costs results in a negative cost cycle.

Claim 7.2. Algorithm (7.1) returns false.

Proof: Once again, we use the same technique as in [CLR92]. □

The time taken by the algorithm is dominated by the $O(n^3)$ loop represented by Steps
2-6 of Algorithm 7.1. Each call to Symbolic-Relax takes time $O(C)$, where $C$ is the
time taken by the fastest convex programming algorithm [HuL93]. Accordingly, the total
time taken by Symbolic-Bellman-Ford is $O(n^3.C)$.

7.2 Example

Let us apply our techniques to the following problem.

We have four tasks $\{J_1, J_2, J_3, J_4\}$ with execution times $\{e_1 \in [4, 8], e_2 \in [6, 11],
e_3 \in [10, 13], e_4 \in [3, 9]\}$ and start times $\{s_1, s_2, s_3, s_4\}$ constrained through:

(a) Task $J_4$ finishes before time 56: $s_4 + e_4 \le 56$

(b) Task $J_4$ finishes within 12 units of $J_3$: $s_4 + e_4 \le s_3 + e_3 + 12$

(c) Task $J_4$ starts no earlier than 18 units after $J_2$ completes: $s_2 + e_2 + 18 \le s_4$

(d) Task $J_3$ finishes within 31 units of $J_1$ completing: $s_3 + e_3 \le s_1 + e_1 + 31$

(e) Implicit are the ordering constraints:

$0 \le s_1$, $s_1 + e_1 \le s_2$, $s_2 + e_2 \le s_3$, $s_3 + e_3 \le s_4$

From (3), the parametric schedulability query is:

$$\exists s_1 \forall e_1 \in [4, 8]\ \exists s_2 \forall e_2 \in [6, 11]\ \exists s_3 \forall e_3 \in [10, 13]\ \exists s_4 \forall e_4 \in [3, 9]\ \{(a), (b), (c), (d), (e)\} \qquad (15)$$

We construct the graph in Figure (2) as per the discussion in Section 6.1.

Fig. 2. Constraint Graph Corresponding to Example

In this case, the convex domain is the axis-parallel hyper-rectangle $E = [4, 8] \times
[6, 11] \times [10, 13] \times [3, 9]$. We provide the graph $G' = \langle V', E' \rangle$ and $E$ as the input to
Algorithm (7.1). The tables below (1-2) detail the iterations of the algorithm.

At the end of the 2nd iteration, the shortest path values converge and after applying
Steps (8-12) of Algorithm (7.1), we conclude that there is no negative cost loop in the
graph.
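
The conclusion can be cross-checked numerically with a different technique from the symbolic algorithm. Every cycle cost is an affine function of $e$, so its minimum over the box $E$ is attained at a corner; substituting each of the 16 corners and testing for a negative cycle (here with a Floyd-Warshall style all-pairs pass) is therefore enough for this check. This is a sketch under those assumptions; the edge encoding and helper names are hypothetical.

```python
import math
from itertools import product

BOX = [(4, 8), (6, 11), (10, 13), (3, 9)]   # ranges of e1, e2, e3, e4


def edges_for(e):
    """Edges (a, b, cost) encoding s_a - s_b <= cost, with s_0 the added task."""
    e1, e2, e3, e4 = e
    return [
        (4, 0, 56 - e4),         # (a) s4 + e4 <= 56
        (4, 3, e3 - e4 + 12),    # (b) s4 + e4 <= s3 + e3 + 12
        (2, 4, -18 - e2),        # (c) s2 + e2 + 18 <= s4
        (3, 1, e1 - e3 + 31),    # (d) s3 + e3 <= s1 + e1 + 31
        (0, 1, 0),               # (e) 0 <= s1
        (1, 2, -e1),             #     s1 + e1 <= s2
        (2, 3, -e2),             #     s2 + e2 <= s3
        (3, 4, -e3),             #     s3 + e3 <= s4
    ]


def has_negative_cycle(n, edges):
    """All-pairs shortest paths; a negative diagonal entry signals a negative cycle."""
    dist = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for a, b, w in edges:
        dist[a][b] = min(dist[a][b], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return any(dist[i][i] < 0 for i in range(n))


corners = list(product(*BOX))
corners_with_cycle = [e for e in corners if has_negative_cycle(5, edges_for(e))]
```

For this instance `corners_with_cycle` is empty, in agreement with the statement that the graph has no negative cost loop.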



Table 1. Iteration 1

Table 2. Iteration 2 (Final iteration)

8 Concluding Remarks

In this paper, we set out to address the following question: Are there polynomial time
algorithms for execution times in domains other than axis-parallel hyper-rectangles?

We answered this question by providing a polynomial algorithm for the case when
the execution times belong to arbitrary convex domains. Our algorithms are simple and
easy-to-implement extensions of existing algorithms for network problems. Our work
is currently being implemented in the Maruti Operating System¹ [STA00].

It would be interesting to see if the techniques presented in this paper can be extended
to a wider class of constraints in real-time scheduling.

¹ Maruti is the registered trademark of the Maruti Real-Time Operating System, developed at
the University of Maryland, College Park; http://www.cs.umd.edu/projects/maruti

References

BFW97. Azer Bestavros and Victor Fay-Wolfe, editors. Real-Time Database and Information
Systems, Research Advances. Kluwer Academic Publishers, 1997.

BS74. K. R. Baker and Z. Su. Sequencing with Due-Date and Early Start Times to Minimize
Maximum Tardiness. Naval Res. Log. Quart., 21:171-176, 1974.

CLR92. T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT
Press and McGraw-Hill Book Company, 6th edition, 1992.

Cof76. E. G. Coffman. Computer and Job-Shop Scheduling Theory, Ed. Wiley, New York,
1976.

DL78. S. K. Dhall and C. L. Liu. On a real-time scheduling problem. Operations Research,
26(1):127-140, Jan. 1978.

DRSK89. A. Damm, J. Reisinger, W. Schwabl, and H. Kopetz. The Real-Time Operating System
of MARS. ACM Special Interest Group on Operating Systems, 23(3):141-157, July
1989.

GPS95. R. Gerber, W. Pugh, and M. Saksena. Parametric Dispatching of Hard Real-Time
Tasks. IEEE Transactions on Computers, 1995.

HN94. Dorit S. Hochbaum and Joseph (Seffi) Naor. Simple and fast algorithms for linear
and integer programs with two variables per inequality. SIAM Journal on Computing,
23(6):1179-1192, December 1994.

HuL93. J. B. Hiriart-Urruty and C. Lemarechal. Convex Analysis and Minimization Algorithms.
Springer-Verlag, 1993.

Kor83. Y. Koren. Computer Control of Manufacturing Systems. McGraw-Hill, New York,
1983.

LTCA89. S. T. Levi, S. K. Tripathi, S. D. Carson, and A. K. Agrawala. The Maruti Hard
Real-Time Operating System. ACM Special Interest Group on Operating Systems,
23(3):90-106, July 1989.

MAT90. D. Mosse, Ashok K. Agrawala, and Satish K. Tripathi. Maruti a hard real-time operating
system. In Second IEEE Workshop on Experimental Distributed Systems, pages 29-34.
IEEE, 1990.

MKAT92. D. Mosse, Keng-Tai Ko, Ashok K. Agrawala, and Satish K. Tripathi. Maruti: An
Environment for Hard Real-Time Applications. In Ashok K. Agrawala, Karen D.
Gordon, and Phillip Hwang, editors, Maruti OS, pages 75-85. IOS Press, 1992.

NW88. G. L. Nemhauser and L. A. Wolsey. Integer and Combinatorial Optimization. John
Wiley & Sons, New York, 1988.

PS82. C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization. Prentice Hall,
1982.

SA00a. K. Subramani and A. K. Agrawala. The parametric polytope and its applications
to a scheduling problem. Technical Report CS-TR-4116, University of Maryland,
College Park, Department of Computer Science, March 2000. Submitted to the 7th
International Conference on High Performance Computing (HiPC) 2000.

SA00b. K. Subramani and A. K. Agrawala. The static polytope and its applications to a
scheduling problem. 3rd IEEE Workshop on Factory Communications, September
2000.

Sak94. Manas Saksena. Parametric Scheduling in Hard Real-Time Systems. PhD thesis,
University of Maryland, College Park, June 1994.

Sch87. Alexander Schrijver. Theory of Linear and Integer Programming. John Wiley and
Sons, New York, 1987.

SdSA94. M. Saksena, J. da Silva, and A. Agrawala. Design and Implementation of Maruti-II. In
Sang Son, editor, Principles of Real-Time Systems. Prentice Hall, 1994. Also available
as CS-TR-2845, University of Maryland.

SE87. K. Shin and M. Epstein. Intertask communication in an integrated multi-robot system.
IEEE Journal of Robotics and Automation, 1987.

SK90. K. Srinivasan and P. K. Kulkarni. Cross-coupled control of biaxial feed drive
mechanisms. ASME Journal of Dynamic Systems, Measurement and Control, 112:225-232,
1990.

STA00. K. Subramani, Bao Trinh, and A. K. Agrawala. Implementation of static and parametric
schedulers in Maruti. Manuscript in Preparation, March 2000.

TSYT97. M. Tayara, Nandit Soparkar, John Yook, and Dawn Tilbury. Real-time data and
coordination control for reconfigurable manufacturing systems. In Azer Bestavros and
Victor Fay-Wolfe, editors, Real-Time Database and Information Systems, Research
Advances, pages 23-48. Kluwer Academic Publishers, 1997.

Y.K80. Y. Koren. Cross-coupled biaxial computer control for manufacturing systems. ASME
Journal of Dynamic Systems, Measurement and Control, 102:265-272, 1980.




Co-Simulation of Hybrid Systems: Signal-Simulink

Stephane Tudoret, Simin Nadjm-Tehrani*, Albert Benveniste, and
Jan-Erik Strömberg

Dept. of Computer & Information Science, Linköping University,
S-581 83 Linköping, Sweden, e-mail: simin@ida.liu.se
IRISA-INRIA, Campus de Beaulieu, Rennes, France
DST Control AB, Mjärdevi Science Park, Linköping, Sweden



Abstract. This article presents an approach to simulating hybrid systems.
We show how a discrete controller that controls a continuous environment
can be co-simulated with the environment (plant) using C-code
generated automatically from mathematical models. This approach uses
Signal with Simulink to model complex hybrid systems. The choices
are motivated by the fact that Signal is a powerful tool for modelling
complex discrete behaviours and Simulink is well-suited to deal with
continuous dynamics. In particular, progress in formal analysis of Signal
programs and the common availability of the Simulink tool make
these an interesting choice for combination. We present various alternatives
for implementing communication between the underlying sub-models.
Finally, we present interesting scenarios in the co-simulation of a discrete
controller with its environment: a non-linear siphon pump originally
designed by the Swedish engineer Christofer Polhem in 1697.



1 Introduction 

The use of software and embedded electronics in many control applications leads
to higher demands on analysis of system properties due to added complexity.
Simple controller blocks in Matlab are increasingly replaced by large programs
with discrete mode changes realising non-linear, hierarchical control and
supervision. The analysis of these design structures benefits from modelling
environments using languages with formal semantics - for example, finite state machines
(e.g. Statecharts [11], Esterel [5]), or clocked data flows (e.g. Lustre [9],
Signal [8]).

These (discrete-time) languages and associated tools provide support in
programming the controller in many ways. To begin with, they provide an
architectural view of the program in terms of hierarchical state machines or block
diagrams. In recent years, certain modelling environments for continuous
systems have also been augmented with versions inspired by these languages, e.g.
Matlab Stateflow [20] and MatrixX [12] discrete-time superblocks.

* This work was supported by the Esprit LTR research project SYRF. The second
author was also supported by the Swedish research council for engineering sciences
(TFR).

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 134-151, 2000.

© Springer-Verlag Berlin Heidelberg 2000

In addition, formal semantics for the underlying languages allows the
controller design to be formally analysed. Constructive semantics in Esterel and
clock calculi in Lustre and Signal enable formal analysis directly at the
compilation stage [4]. Properties otherwise checked by formal verification at later stages
of development [6], e.g. causal consistency or determinism, are checked much
earlier. Also, results of these analyses are used at later stages of development -
in particular, for automatic code generation (code optimisation) and code
distribution [2,3,7,13]. Note that these types of formal analysis of a discrete controller
are so far not supported in the traditional modelling environments (e.g. Matlab
and MatrixX).

However, properties at the system level still have to be addressed by the
analysis of the closed loop system. Formal verification of hybrid models is
generating new techniques for this purpose. Restrictions on the class of differential
and algebraic equations (DAE) for the plant, or approximations on the model to
get decidability, are active areas of research [10,26,14].

In this paper we explore another direction aimed at applications where the
DAE plant model is directly used for controller testing within the engineering
design process. That is, we study the question of co-simulation. Formal
verification can be a complement to, or make use of, the knowledge obtained by
integrated simulation environments. In this set-up the plant is specified as a set
of DAE and the controller is specified in a high level design language. The
controller is subjected to formal verification supported by the discrete modelling
tools, and the closed loop system is analysed by co-simulation. To this end, we
propose a framework in which Signal programs and Matlab-Simulink [22]
models can be co-simulated using automatically generated C-code. We present
the application of the framework to a non-trivial example suggested earlier [27,
28].

2 Introduction to SIGNAL 

Signal is a data-flow style synchronous language specially suited for signal
processing and control applications [1,16,18]. A Signal program manipulates
signals, which are unbounded series of typed values (logical, integer...), with an
associated clock denoting the set of instants when values are present. Signals of
a special kind called event are characterised only by their clock, i.e., their presence
(when they occur, they give the Boolean value true). Given a signal X, its clock
is obtained by the language expression event X, resulting in the event that is
present simultaneously with X. To constrain signals X and Y to be synchronous,
the Signal language provides the operation: synchro X, Y. The absence of a
signal is noted ⊥.



2.1 The Kernel of Signal 

Signal is built around a small kernel comprising five basic operators (functions,
delay, selection, deterministic merge, and parallel composition). These operators
allow one to specify in an equational style the relations between signals, i.e., between
their values and between their clocks.

Functions (e.g., addition, multiplication, conjunction, ...) are defined on the types
of the language. For example, the Boolean negation of a signal E is not E.

    X := f(X1, X2, ..., Xn)

The signals X, X1, X2, ..., Xn must all be present at the same time, so they are
constrained to have the same clock.

Delay gives the previous value ZX of a signal X, with initial value V0:

    ZX := X$1 init V0

Selection of a signal Y is possible according to a Boolean condition C:

    X := Y when C

The clock of signal X is the intersection of the clock of Y and the clock of
occurrences of C at the value true. When X is present, its value is that of Y.

    Y             : ⊥ 1 2 3 4 ⊥ 5
    C             : t ⊥ t f ⊥ t t
    X := Y when C : ⊥ ⊥ 2 ⊥ ⊥ ⊥ 5

Deterministic merge defines the union of two signals of the same type, with a
priority on the first one if both are present simultaneously:

    X := Y default Z

The clock of signal X is the union of that of Y and that of Z. The value of X
is the value of Y when Y is present, or else the value of Z if Z is present and Y
is not.

    Y                : 1 ⊥  2  3 ⊥  4 5
    Z                : ⊥ 10 20 ⊥ 30 ⊥ 50
    X := Y default Z : 1 10 2  3 30 4 5

Parallel composition: each equation in Signal is like an elementary process.
Parallel composition of processes is made by the associative and commutative
operator "|", denoting the union of the equation systems. In Signal, the parallel
composition of P1 and P2 is written:

    (| P1 | P2 |)
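
The behaviour of delay, selection and deterministic merge can be mimicked on finite traces. The following Python sketch is only an illustration and is not Signal code: it assumes that signals are lists aligned on a common sequence of instants, with `None` standing for the absence symbol ⊥, and the function names are hypothetical.

```python
ABSENT = None  # stands for the absence of a signal at an instant


def when(y, c):
    """X := Y when C: present only when Y is present and C is present and true."""
    return [yv if (yv is not ABSENT and cv is True) else ABSENT
            for yv, cv in zip(y, c)]


def default(y, z):
    """X := Y default Z: value of Y when present, otherwise value of Z (if present)."""
    return [yv if yv is not ABSENT else zv for yv, zv in zip(y, z)]


def delay(x, init):
    """ZX := X$1 init V0: on the clock of X, the previous value of X."""
    out, previous = [], init
    for xv in x:
        if xv is ABSENT:
            out.append(ABSENT)
        else:
            out.append(previous)
            previous = xv
    return out


# Traces from the text (t/f written as True/False).
Y1 = [ABSENT, 1, 2, 3, 4, ABSENT, 5]
C = [True, ABSENT, True, False, ABSENT, True, True]
X_when = when(Y1, C)

Y2 = [1, ABSENT, 2, 3, ABSENT, 4, 5]
Z = [ABSENT, 10, 20, ABSENT, 30, ABSENT, 50]
X_default = default(Y2, Z)

ZX = delay([1, ABSENT, 2, 3], 0)
```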







2.2 Tools 

All the different tools which make up the Signal environment use only one
tree-like representation of programs; thus we can go from one tool to another without
using an intermediate data structure. The principal tools are the compiler, which
translates Signal programs into C, the graphical interface and, for the
classic temporal logic specifications, the verification tool Sigali.

The most interesting tool from a formal verification point of view is the
Sigali tool supporting the formal calculus. It contains a verification and controller
synthesis tool-box [17,15], and facilitates proving correctness of the dynamical
behaviour of a system with respect to a temporal logic specification.

The equational nature of the Signal language leads to the use of polynomial
dynamical equation systems (PDS) over $\mathbb{Z}/3\mathbb{Z}$ as a formal model of program
behaviour. Polynomial functions over $\mathbb{Z}/3\mathbb{Z}$ provide us with efficient
algorithms to represent these functions and polynomial equations. Hence, instead
of enumerating the elements of sets and manipulating them explicitly, this
approach manipulates the polynomial functions characterising their set. This way,
various properties can be efficiently proved on polynomial dynamical systems.
The same formalism can also be efficiently used for solving the supervisory
control problem.



3 Introduction to SIMULINK 

Simulink is the part of the Matlab toolbox for modelling, simulating, and
analysing dynamical systems. It provides several solvers for the numerical
integration of sets of Ordinary Differential Equations (ODEs). Like Signal,
Simulink allows stand-alone generation in four steps, i.e. specify a model,
generate C code, generate makefile and generate stand-alone program. For code
generation, however, it is currently not possible to use variable-step solvers to
build the stand-alone program. Thus, we had to use the fixed step size solvers,
and therefore, the step size needs to be set accurately.
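
Why the step size matters for a fixed-step solver can be seen on a one-line ODE. The sketch below is not Simulink's solver; it is a plain forward Euler integrator written for illustration, and the test equation $\dot{x} = -x$, $x(0) = 1$ and the function name `euler` are assumptions.

```python
import math


def euler(f, x0, h, steps):
    """Forward Euler with fixed step size h: x_{k+1} = x_k + h * f(x_k)."""
    x = x0
    for _ in range(steps):
        x = x + h * f(x)
    return x


# Integrate dx/dt = -x from t = 0 to t = 1 and compare with the exact value exp(-1).
exact = math.exp(-1.0)
errors = {h: abs(euler(lambda x: -x, 1.0, h, round(1.0 / h)) - exact)
          for h in (0.5, 0.1, 0.01)}
```

On this example the error shrinks as the step size is reduced, at the price of more steps per simulated second.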

Simulink Real-Time Workshop (RTW) [19] is the setting for automatic C
code generation from Simulink block diagrams via a Target Language Compiler
(TLC) [21]. By default¹, the RTW mainly gives four C files: <Model>.c,
<Model>.h, <Model>.prm and <Model>.reg. The function of these files in
stand-alone simulation is fully described in [29]. Figure 1 summarises the architecture
of the stand-alone code generation with Simulink. The makefile is automatically
made from a template makefile (for example grt_unix.tmf is the generic
real-time template makefile for UNIX).

By default, the run of a stand-alone program provides a Matlab data file
(<Model>.mat). Before building the stand-alone program, it is possible to
select which data to include in the Matlab file. One can then use
Matlab to plot the result.

¹ It is possible to customise the C code generated from any Simulink model with the
TLC, which is a tool that is included in RTW.




[Figure 1 labels: 1) Model building; 2) C code generating; 3) Makefile creating; 4) Stand-alone building;
files grt_unix.tmf, model.mk, grt_main.c; Download to target hardware]

Fig. 1. Automatic code-generation within the Real-Time Workshop architecture



4 Modelling Multi-mode Hybrid Systems 

Signal and Simulink both have a data-flow oriented style. Here we present a
mathematical framework in which both Signal and Simulink sub-models can
be plugged in to form a hybrid system.

Hybrid systems can be mathematically represented as follows:

$$\dot{x}_i = f_i(q, x_i, u_i, d_i), \quad x_i \in \mathbb{R}^{n_i},\ q \in Q \qquad (1)$$

$$y_i = h_i(q, x_i, u_i) \qquad (2)$$

$$e_i = s_i(q, x_i, u_i, y_i) \qquad (3)$$

$$\tau_i = e_i . \mathbf{1}_{\{e_i \ne e_{i-}\}} \qquad (4)$$

$$q' = T(q, \tau), \quad \tau = (\tau_i,\ i = 1, \ldots, I) \qquad (5)$$

Where:

(1): $i$ indexes a collection of continuous time subsystems (CTS),
$q \in Q$ is the discrete state, where $Q$ is a finite alphabet,
$x_i \in \mathbb{R}^{n_i}$ is the vector continuous state of the $i$th CTS,
$u_i \in \mathbb{R}^{m_i}$ is the vector continuous control of the $i$th CTS,
$d_i \in \mathbb{R}^{o_i}$ is the vector continuous disturbance of the $i$th CTS.

(2): $y_i \in \mathbb{R}^{p_i}$ is the vector continuous output of the $i$th CTS.

(3): $e_i \in \mathbb{B}^{r_i}$, where $\mathbb{B}$ is the Boolean domain. Thus at each instant an $r_i$-tuple of
predicates depending on the current values of $(q, x_i, u_i, y_i)$ is evaluated.
Examples are $x_i^j > 0$, where superscript $j$ refers to the $j$th component of $x_i$
if $x_i = (x_i^1, \ldots, x_i^{n_i})$, or $g(q, x_i, u_i, y_i) > 0$ for some real-valued function
$g(q, \cdot, \cdot, \cdot)$, and so on.

(4): $e_{i-}(t)$ denotes the left limit of $e_i$ at $t$, i.e., the limit of $e_i(s)$ for $s < t$, $s \nearrow t$.
Here $e_i^k(t) \ne e_{i-}^k(t)$ means that the $k$th predicate changes its status
at instant $t$; this generates an event $\tau_i^k$. The marked events together form
a vector event $\tau_i$ (and the latter form the vector event $\tau$). Thus trajectories
$e_i$ are piecewise constant.

(5): $q$, $q'$ are the current and next discrete automaton state.

We use an architectural decomposition earlier used for several case stu- 
dies [25]. Here we use it to discuss the way the communication between the 
two sub-models can be implemented for co-simulation. 

In the generic architecture shown in Figure 2, the Plant (P) is the physical
environment under control. The inputs u, the outputs y and the disturbances
d all have continuous domains. The Characterizer (C) is the interface between
the continuous plant and the discrete selector, including A/D converters. The
Selector (S) is the purely discrete part of the controller, with discrete input,
state and output. The Effector (E) is the interface between the discrete selector
commands and the continuous physical variables, including actuators.

This architecture is a good starting point for hybrid system modelling. It 
remains to decide: 

— How to map the mathematical representation above on the architecture? 

— Which parts should be modelled in Signal and which parts in Simulink? 

— How the Signal part should be activated? Which mechanism should be used 
including A/D convertors. 




Fig. 2. General hybrid system architecture. Solid (dotted) arrows represent continuous 
(discrete) flows 








From our introductory remarks it should be fairly obvious that selector
modelling is best done in Signal, and that Simulink is best for modelling the
plant. Thus, it remains to determine how to implement the interface between
the two, or rather, where and how to model the characterizer and the effector.
Next, we need to determine how to generate runs of the hybrid system.

In this paper we adopt the scheme whereby the main module of the Simulink
model is the master and the Signal automaton is one of the many processes run
in a pseudo-parallel fashion. This is realisable using the translation scheme in
RTW. The Simulink model then contains input ports allowing Simulink
subsystem blocks to be enabled and disabled, and output ports allowing subsystems
to emit events to the controller. The connection can now be made by means of
global variable passing.

5 Computational Model with Global Variable Passing 

The mathematical model in section 4 is a natural way to conceptualise and 
model a multi-mode hybrid system. To implement such a system we have to 
transform these equations into a computational model. In this section we cast the 
generic mathematical model into the architectural framework presented earlier. 
In section 6 we provide three protocols for activation of the Signal part of the 
model. 

The plant is made of a collection of finite continuous time subsystems. As
in the mathematical representation of section 4, let $I$ be the cardinality of the
collection and let $i$ index over $I$. Each subsystem $i$ contains a vector $x_i \in \mathbb{R}^{n_i}$ of
$n_i$ continuous state variables and also $n_i$ differential equations. This set of equations can
be rewritten as follows:

$$
\begin{pmatrix} \dot{x}_i^1 \\ \vdots \\ \dot{x}_i^{n_i} \end{pmatrix}
=
\begin{pmatrix} f_i^1(q, x_i^1, u_i, d_i) \\ \vdots \\ f_i^{n_i}(q, x_i^{n_i}, u_i, d_i) \end{pmatrix}
\qquad (6)
$$

Hence, the system contains $\sum_{k=1}^{I} n_k$ differential equations for each $q$. That is, there are
$J = |Q| \times \sum_{k=1}^{I} n_k$ differential equations in the continuous system. However, the
implementation needs to extract the discrete parameter $q \in Q$ from these differential
equations.

At any time $t$, one or several equations among this collection form the basis for
computation. Consider the whole set of system equations as follows:



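$$
\begin{aligned}
F_1(x_1^1, u_1, d_1) &= f_1^1(q_1, x_1^1, u_1, d_1) \\
F_2(x_1^1, u_1, d_1) &= f_1^1(q_2, x_1^1, u_1, d_1) \\
&\;\;\vdots \\
F_{|Q|}(x_1^1, u_1, d_1) &= f_1^1(q_{|Q|}, x_1^1, u_1, d_1) \\
F_{|Q|+1}(x_1^2, u_1, d_1) &= f_1^2(q_1, x_1^2, u_1, d_1) \\
&\;\;\vdots
\end{aligned}
\qquad (7)
$$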







Let $j$ be a new index over the system equations. Then, we can define a
new function $F_j$ for the $j$th equation in the above list. Now we can rewrite each
differential equation as follows:

$$\dot{x}_j = F_j(x_j, u_j, d_j) \qquad (8)$$

which allows us to calculate the continuous state vector $x$ and, via the equation
$y = h(x, u)$, the continuous output vector $y$. Then $y$ feeds the characterizer, and the
equation $e = s(y)$ defines the detection of event $e$.

Figure 3 shows one possible mapping of the mathematical representation
into the architecture (later, we will see that this is not the only mapping). In
comparison with Figure 2, a new component has been added in the controller:
the Edge detector, which corresponds to equation (4). The discrete state $q$
is defined only in the selector, which is the only purely discrete part. So, the
selector contains the rewritten form of equation (5):



$$q' = T(q, \tau), \qquad \tau = (\tau_j,\ j = 1, \ldots, J) \qquad (9)$$



and the new equation below: 



$$c = g(q') \qquad (10)$$



where $c \in \mathbb{R}^J$ is the discrete control vector of the effector. The effector deduces
from its input $c$ two vectors, the continuous control $u$ and $enabl \in \mathbb{B}^J$, by:



$$(u_j, enabl_j) = k(c_j) \qquad (11)$$



$enabl_j$ is used by the plant to enable or disable the $j$th differential equation and
$u_j$ is the continuous control vector of the $j$th differential equation.

Since the discrete controller (the automaton) is in one state at any one
computation point¹, it follows that the change in continuous state is well-defined,
i.e. although several equations are enabled in parallel, only one equation at a
time is chosen for each continuous state variable.

¹ This is a property of the data-flow program ensured by formal analysis built into
the compilers for synchronous languages.
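The interplay between equations (8) and (11) can be sketched as follows. This is only an illustration under stated assumptions: the mapping `effector` (a nonzero command enables an equation and doubles as its input), the explicit Euler step, and the choice to hold a disabled state constant are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Rhs = Callable[[float, float, float], float]  # F_j(x_j, u_j, d_j)


def effector(c: Sequence[float]) -> Tuple[List[float], List[bool]]:
    """Hypothetical k: a nonzero command c_j enables equation j and is used as u_j."""
    u = [float(cj) for cj in c]
    enabl = [cj != 0.0 for cj in c]
    return u, enabl


def plant_step(x: Sequence[float], u: Sequence[float], d: Sequence[float],
               enabl: Sequence[bool], F: Sequence[Rhs], dt: float) -> List[float]:
    """One explicit Euler step; equation j is integrated only if enabl[j] is set."""
    return [
        xj + dt * Fj(xj, uj, dj) if on else xj
        for xj, uj, dj, on, Fj in zip(x, u, d, enabl, F)
    ]


# Two candidate equations; the command vector c enables only the first one.
F = [lambda x, u, d: u - x + d, lambda x, u, d: -2.0 * x]
u, enabl = effector([1.0, 0.0])
x_next = plant_step([0.0, 4.0], u, [0.0, 0.0], enabl, F, dt=0.5)
```

In an actual Simulink model the enabling would be done by enabled subsystems and a proper solver rather than by a hand-written Euler step.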










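[Figure 3 (block diagram). Blocks: Plant (physical environment), with the rule "if $enabl_j = 1$ then $\dot{x}_j = f_j(x_j, u_j, d_j)$"; Effector, $(u, enabl) = k(c)$; Characterizer, $e = s(y)$; Edge detector; Selector (purely discrete part), $q' = T(q, \tau)$ and $c = g(q')$. The Edge detector and the Selector form the Controller. Signals: $u$, $enabl$, $y$, $e$. The diagram is divided into a SIMULINK part and a SIGNAL part.]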



Fig. 3. Hybrid system representation 



6 Selector Activations 

The selector, i.e. the union of equations (9) and (10), is assumed to work in
discrete time, meaning that continuous time $t$ is sampled with period $\Delta t$. During
each sampling period, from $t$ to $t + \Delta t$, the trajectory of $e_j$ is recorded, and it is hoped
that each component of $e_j$ changes at most once during the sampling period. If $e_j$
changes during the sampling period then the event $\tau_j$ is emitted. There are then
several possibilities for how the selector checks the event $\tau_j$. These possibilities
depend on how the selector is activated. Here we discuss three activation methods:
periodic, aperiodic and asynchronous selector activations.



6.1 Periodic Synchronous Selector Activations 

Synchronous means here that the selector activation coincides with a tick of the 
clock of the sampled continuous system. 

Protocol 1 At each sampling period $\Delta t$, the selector senses the final value of
vector $\tau_j$, and applies its transition according to (9).

This protocol is simple, but assumes that the sampling period $\Delta t$ is small enough
to avoid missing events. This may typically lead to taking a $\Delta t$ much smaller
than really needed, i.e., to activating the automaton for nothing most of the time.









6.2 Aperiodic Synchronous Selector Activations

Protocol 2 Here the continuous time system (equations (8)) is the master, driven
by continuous real time $t$. Each time some $\tau_j$ occurs, a "wake_up" event is
generated by the continuous time system in which $\tau_j$ was generated. The
selector (equation (9)) waits for wake_up, so wake_up is the activation clock of
the selector. When activated, the automaton checks which event $\tau_j$ was received,
and moves accordingly, following equation (9).

Within this protocol, the master is the continuous time system, and the selector
reacts to the events output by the continuous time system. More precisely,
the continuous time system outputs wake_up (in addition to $\tau_j$), which in turn
activates the selector.
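The difference between periodic and aperiodic activation can be illustrated with a small sketch. It assumes that the continuous side is itself stepped in discrete sampling periods and delivers one event vector per period. The two-state transition function `toggle` and the helper names are hypothetical, chosen only to make the activation counts visible.

```python
from typing import Callable, Sequence, Tuple

Event = Tuple[bool, ...]


def toggle(q: str, tau: Event) -> str:
    """Hypothetical transition function T: switch state when event 0 occurs."""
    if tau[0]:
        return "B" if q == "A" else "A"
    return q


def run_periodic(taus: Sequence[Event], T: Callable[[str, Event], str], q0: str):
    """Periodic activation: the selector runs at every sampling period."""
    q, activations = q0, 0
    for tau in taus:
        q = T(q, tau)
        activations += 1
    return q, activations


def run_aperiodic(taus: Sequence[Event], T: Callable[[str, Event], str], q0: str):
    """Aperiodic activation: the selector runs only when wake_up is raised."""
    q, activations = q0, 0
    for tau in taus:
        wake_up = any(tau)  # raised by the continuous side when some tau_j occurs
        if wake_up:
            q = T(q, tau)
            activations += 1
    return q, activations


# Five sampling periods with a single event, in the third period.
taus = [(False,), (False,), (True,), (False,), (False,)]
periodic = run_periodic(taus, toggle, "A")
aperiodic = run_aperiodic(taus, toggle, "A")
```

Both runs end in the same discrete state, but the aperiodic run activates the automaton once instead of five times.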

6.3 Asynchronous Selector Activations 

Here, continuous subsystems and the selector have independent Simulink threads;
above all, this means that the selector has its own thread and its own activation
clock.

Protocol 3 At each round, the selector senses whether there is some event $\tau$.
If so, the selector moves accordingly, following equation (9), and
finally outputs the state changes to the effector following equation (10).

It is important to note that with Protocol 3 the $\tau$ generation should be
done in the Signal part instead of the Simulink part (compare with Figure 3).
Indeed, if $\tau$ is provided by Simulink, there is a risk that the selector will
miss some $\tau$ because no assumption can be made about when the selector will
check its input channels. In the best case some $\tau$ are recognised with a delay of
one tick in the selector.
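The risk of missed events can be made concrete with a small sketch. It assumes a discretised time axis, an event that is visible for one step only when produced on the continuous side, and a selector that polls at its own unrelated instants. The function names and the polling schedule are hypothetical.

```python
from typing import List, Sequence


def poll_pulses(tau_pulses: Sequence[bool], poll_times: Sequence[int]) -> List[int]:
    """The selector reads a one-step event pulse only at its own poll instants."""
    return [t for t in poll_times if tau_pulses[t]]


def poll_levels(e_levels: Sequence[bool], poll_times: Sequence[int]) -> List[int]:
    """The selector reads the predicate value e and detects the edges itself."""
    seen, last = [], e_levels[0]
    for t in poll_times:
        if e_levels[t] != last:
            seen.append(t)
            last = e_levels[t]
    return seen


# The predicate e becomes true at step 3 and stays true.
e_levels = [False, False, False, True, True, True, True, True]
# Event pulse as it would be produced on the continuous side: true at step 3 only.
tau_pulses = [False] + [e_levels[t] != e_levels[t - 1] for t in range(1, len(e_levels))]
poll_times = [0, 2, 4, 6]  # the selector's own activation instants

missed_case = poll_pulses(tau_pulses, poll_times)
detected_case = poll_levels(e_levels, poll_times)
```

In this sketch the pulse at step 3 is never observed, whereas edge detection on the selector side notices the change at its next activation, one step late.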

7 Application: The Siphon Pump 

The protocols for aperiodic and asynchronous selector activations have been
implemented in our co-simulation environment [29]. In this section we give a
brief account of the application of the aperiodic protocol to a non-trivial example
introduced earlier in [27,28]. This is a model of a siphon pump machine invented
by the Swedish engineer Christofer Polhem in 1697. The purpose of the pump
was to drain water from the Swedish copper mines with almost no movable
parts. This works by having a system of interconnected open and closed tanks,
and driving the water up to the ground level by adjusting the pressure in the
closed tanks via shunt valves. The idea of the pump was so revolutionary in those
times that the pump was never built. However, a model of the pump going back
to the 17th century is the basis of the dimensions (and therefore the coefficients
in the model) that we have used in our down-scaled model. Figure 4 shows a
fragment of the pump consisting of the bottom three tanks.







The plant model has several interesting characteristics. First, even without
the discrete controller, there are some discrete dynamic changes in the plant.
These are brought about by the two check valves (hydro-mechanically) controlling
the flow of water between each open and closed container. Secondly, the
plant dynamics (and also the closed loop dynamics) are non-linear. When the
check valve between container $i$ and container $i+1$ is cracked, the flow of water
in that pipe, denoted by $q_{i(i+1)}$, is defined by $\dot{q}_{i(i+1)} = f(p_{i(i+1)}, q_{i(i+1)})$, where $f$
is a non-linear function and $p_{i(i+1)}$ is the pressure in the pipe between container
$i$ and container $i+1$.

For closed-loop simulation we thus had to make an appropriate decomposi- 
tion, placing the purely discrete parts (including switching in the plant) in the 
Signal environment, and the purely continuous parts in the Simulink environ- 
ment. 




Fig. 4. A fraction of the siphon pump machine 



7.1 Working Principles 

The purpose of the pump is to lift the water which flows into the sump at the
bottom of the mine to the drained ground level sump. The pump works in a
two-phase (pull and push) manner, and the principle works for an arbitrary
system of alternating closed and open tanks, as follows.



The pull phase In the pull phase, the pressure vessels (the closed tanks) are
de-pressurised by opening the $P^-$ side of the shunt valve which drains the vessels
(the $P^-$ side is connected to a negative pressure source, e.g. a vacuum tank).
Now, the water will be lifted from all the open containers to the pressure vessels
immediately above. Hence, as a result of this first phase, all the pressure vessels
will be water-filled.








The push phase In the push phase, the pressure vessels are pressurised by
opening the $P^+$ side of the shunt valve to fill the air-compressing vessel with air
(the $P^+$ side represents a positive pressure source, e.g. created by an elevated
lake above the mine). Now, all the pressure vessels will be emptied via the
connections to the open containers immediately above. Hence, as a result of this
second phase, all the open containers will again be filled with water. However,
the water has now been shifted upwards half a section. By repeating these two
phases the water is sequentially lifted to the ground level.

Figure 4 depicts a fraction of the siphon pump machine. The water entering
the bottom container (flow $q_1$) is lifted to the top container by lowering and
raising the pneumatic pressure $P_c$ in the closed vessel. Due to the check valves
(in between the open and closed containers), the water is forced to move upwards
only. The reason why more than three containers and vessels are needed in
practice is that the vertical distance between any pair of vessel and container
must be strictly less than 10 meters, since water can be lifted no higher than ≈ 10
meters by means of the atmospheric pressure (≈ 1 bar). In the sequel we assume
that there are only three levels to the pump and the final flow variable $q_3 = q_{34}$.
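As a rough check of the 10 meter bound, assume fresh water of density $\rho \approx 1000\ \mathrm{kg/m^3}$, gravitational acceleration $g \approx 9.81\ \mathrm{m/s^2}$ and an atmospheric pressure of $1\ \mathrm{bar} = 10^5\ \mathrm{Pa}$. These are standard textbook values, not figures taken from the pump model. The maximum height of a water column that atmospheric pressure can support is then

$$
h_{\max} = \frac{p_{\mathrm{atm}}}{\rho\, g} \approx \frac{10^5\ \mathrm{Pa}}{1000\ \mathrm{kg/m^3} \times 9.81\ \mathrm{m/s^2}} \approx 10.2\ \mathrm{m}.
$$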




Fig. 5. General hybrid system architecture of the pump 



7.2 Mathematical Models 

From the high level description of the pump, it is possible to represent the simu- 
lated system by means of the architecture presented earlier. Thus, the system 
decomposition can be depicted as in Figure 5.








At the topmost block, the pump has the external flow $q_1$ [m³/s] entering
container 1 as input and the external flow $q_3$ leaving container 3 as output.
The flow $q_1$ entering container 1 is determined by the environment (ground water
entering the mine cannot be controlled but is defined by Mother Nature). Hence
$q_1$ is a disturbance signal.

The closed loop system is modelled with the plant supplying control information
to the effector, the characterizer and eventually to the selector. The
selector acts on the pneumatic pressure in container 2, i.e., it increases and
decreases $P_c$. The effector then derives, from $P_c$ and from the gravity-induced
hydraulic pressure due to accumulated water in the containers ($p_1$, $p_2$ and $p_3$), the
net driving pressure of the vertical pipes ($p_{12}$ and $p_{23}$). Hence, in addition to $q_1$,
the plant uses $p_{12}$ and $p_{23}$ to calculate the output flow $q_3$. In order to stimulate
the selector, the characterizer continuously "watches" the water levels of the
containers ($x_1$, $x_2$ and $x_3$) and sends event $\tau$ to the selector when necessary.




Fig. 6. The architecture of the plant 



The refined model of the plant is depicted in Figure 6. It contains mainly two 
check valve systems. Each check valve system is a hybrid system. Indeed, the 
water flow through a check valve behaves differently according to the mode of 
the latter. In the checked mode the water flow is zero and in the cracked mode 
the water flow follows a non-linear differential equation (the interested reader 
is referred to the full report [29] for details of the plant model). Note that the 








check valve can be modelled using both Simulink and Signal: The discrete 
mode changes are modelled in Signal and the rest in Simulink. 

7.3 The Control Strategy 

Finding a safe and optimal controller is far from easy. One of the more important
requirements is to maximise the output flow $q_3$ without risking that $x_i$ will end
up outside defined safe intervals, that is, to avoid overflow in the containers
(and the mine), especially under all possible disturbances ($q_1$).

Another important requirement is related to energy consumption and maintainability.
It is important to minimize the number of switches of the value of
$P_c$. Changing $P_c$ from +50 kPa to −50 kPa and vice versa results in a significant
amount of energy loss. One solution is to keep $P_c$ constant over periods that are
as long as possible.

A naive controller can be depicted by the automaton of Figure 7. This is not 
a robust controller and it was chosen to show the power of the co-simulation 
environment in illustrating its weaknesses. 



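[Figure 7 (automaton). States: Idle, Pull and Push. Guard on the transition from Pull to Push: $x_2 > UB_2$ OR $x_1 < LB_1$. Guard on the transition from Push to Pull: $x_3 > UB_3$ OR $x_2 < LB_2$.]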



Fig. 7. Automaton implementing the control strategy in a selector, UBi and LBi are 
the level upper and lower limits in tank i respectively. 



The behaviour of this controller can be informally described as follows. 

1. The first discrete state, i.e., the Idle state, is the initialisation state. At the 
beginning, the three containers are empty. So it is necessary first to let the 
bottom containers fill. This is what is done in the Idle state. 

2. When the first container is full enough, an event is broadcast by a level 
sensor (which is simulated by the characterizer) and the pump moves from 
the Idle state to the Pull state. 

3. In the Pull state, container 2, i.e., the pressure vessel, is de-pressurised.
Hence container 2 fills from container 1. Note that container 1 is continuously
filled by the input flow $q_1$, which is uncontrollable. So the water level of
container 1 moves according to the input flow $q_1$ and the flow $q_{12}$ in the
pipe between containers 1 and 2. Depending on which flow dominates, the level
of container 1 either rises or falls.







4. If the water level of container 1 moves down until a given minimum threshold 
(detected by a sensor) or if the water level of container 2 is high enough then 
the pump moves from the Pull state to the Push state. 

5. In the Push state, container 2 is pressurised. Hence container 2 stops filling
from container 1 and fills container 3. So, container 1 continues to fill according
to the flow $q_1$ and container 3 fills according to the flow $q_{23}$ (in the pipe
between containers 2 and 3) and the output flow $q_3$. Container 2 is
of course emptied.

6. Finally, if the water level of container 2 reaches its minimum threshold or 
if the water level of container 3 is high enough then the pump comes back 
from the Push state to the Pull state. Thus, the loop is closed. 

The above automaton shows which events lead to discrete state transitions 
of the selector and how these events are detected. Hence it is easy to model a 
characterizer which watches the different water levels and provides the suitable 
events. 
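The informal description above can be sketched as a transition function. This is only an illustration: the guard used for leaving Idle ($x_1 > UB_1$) is an assumption, since the text only says "full enough", and the threshold values in the example are made up.

```python
from typing import Dict


def naive_controller(state: str, x: Dict[int, float],
                     UB: Dict[int, float], LB: Dict[int, float]) -> str:
    """One step of the Idle/Pull/Push automaton.

    x, UB and LB map a tank number to its level, upper limit and lower limit.
    The Idle -> Pull guard is an assumption made for this sketch.
    """
    if state == "Idle" and x[1] > UB[1]:
        return "Pull"
    if state == "Pull" and (x[2] > UB[2] or x[1] < LB[1]):
        return "Push"
    if state == "Push" and (x[3] > UB[3] or x[2] < LB[2]):
        return "Pull"
    return state


# Hypothetical limits (in meters) for the three tanks.
UB = {1: 0.8, 2: 0.8, 3: 0.8}
LB = {1: 0.1, 2: 0.1, 3: 0.1}

s1 = naive_controller("Idle", {1: 0.9, 2: 0.0, 3: 0.0}, UB, LB)   # tank 1 full enough
s2 = naive_controller("Pull", {1: 0.05, 2: 0.5, 3: 0.0}, UB, LB)  # tank 1 too low
s3 = naive_controller("Push", {1: 0.3, 2: 0.05, 3: 0.2}, UB, LB)  # tank 2 too low
```

Note that in this sketch, whenever $x_1 < LB_1$ and $x_2 < LB_2$ hold at the same time, the automaton alternates between Pull and Push at every activation, which is one way a fast switching behaviour could arise.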

8 Analysis Results and Future Works 

In this section we present some co-simulation results. We study the behaviour
of the closed loop system for given disturbance signals (incoming water into
the bottom container) in the presence of the naive controller. The results show that,
while certain aspects of the behaviour are as expected, we also get unsatisfactory
outputs.




Fig. 8. Water levels of the system with $q_1 = 2 \cdot 10^{\ldots}$ m³/s



First, the behaviour of the flow in the different pipes appears satisfactory.
However, that in itself is not sufficient for correctness of the pump








behaviour. Indeed, it is necessary to study the water levels in each container to
check whether there is an overflow. Figure 8 shows such traces. The water level of
the $i$th container is denoted by $x_i$ and $H$ denotes the height of the containers. At
the beginning of the simulation, i.e., at time $t = 0$, the water level in container
1 is 0.02 m and all the other containers are empty. What is important in these
traces is that around $t = 350$ s container 1 overflows, since $x_1$ reaches the value of
$H$, because water was not lifted fast enough against the input water flow $q_1$. The
controller is not to blame, since the overflow is due to $q_1$, which is uncontrollable.

The next plot shows that even if there is no overflow, the controller behaves
badly: an infinitely fast switching behaviour appears in the shunt valve
controller. This undesired behaviour of the system is a direct result of the
naive control strategy adopted, and is not due to the chosen communication protocol.
This lack of robustness in the controller is well illustrated by the co-simulation;
see Figure 9.







Fig. 9. The simulation result illustrating infinite switching. 



Current work includes experiments using the asynchronous protocol. Another 
interesting problem is to study the range of values for qi, for which the pump 
can work without problems; in particular, how simulation and formal verification 
can be combined to analyse such problems. Also, it is interesting to apply the 
combined environment to systems with more complex controller structure [24], 
where formal verification in Sigali and co-simulation in the current environment
are combined. 

A survey of related works on simulation of hybrid systems can be found in 
[23]. A typical requirement in dealing with hybrid simulation is that systems 
with uneven dynamics be simulated with variable step solvers so that rapid 
simulation and accuracy can be combined. Our work points out a weakness 







in the code generation mechanism of Matlab which restricts the ability to 
use variable solvers. On the other hand, this may not be a problem in some 
application areas. For example, it was not considered as a critical issue when 
this work was presented at a forum including our industrial partners from the 
aerospace sector. 

References 

1. T. Amagbegnon, P. Le Guernic, H. Marchand, and E. Rutten. Signal - the
specification of a generic, verified production cell controller. In C. Lewerentz and
T. Lindner, editors, Formal Development of Reactive Systems - Case Study
Production Cell, number 891 in Lecture Notes in Computer Science, chapter 7, pages
115-129. Springer Verlag, January 1995.

2. A. Benveniste, B. Caillaud, and P. Le Guernic. Compositionality in Dataflow
Synchronous Languages: Specification and Distributed Code Generation. Information
and Computation. To appear.

3. A. Benveniste, B. Caillaud, and P. Le Guernic. From Synchrony to Asynchrony. 
In J.C.M. Baeten and S. Mauw, editors. Proceedings of the 10th International 
Conference on Concurrency Theory, CONCUR’99, LNCS 1664, pages 162-177. 
Springer Verlag, 1999. 

4. G. Berry. The Constructive Semantics of Pure Esterel. Technical report,
Centre de Mathématiques Appliquées, 1999. Draft book, available from
http://www-sop.inria.fr/meije/esterel/doc/main-papers.html.

5. F. Boussinot and R. De Simone. The ESTEREL language. Proceedings of the 
IEEE, 79(9):1293-1304, September 1991. 

6. W. Chan, R.J. Anderson, P. Beame, S. Burns, F. Modugno, D. Notkin, and J.D.
Reese. Model Checking Large Software Specifications. IEEE Transactions on
Software Engineering, 24:498-519, July 1998.

7. T. Gautier and P. Le Guernic. Code generation in the SACRES project. In F. Red- 
mill and T. Andersson, editors, Towards System Safety, Proceedings of the Safety- 
critical Systems Symposium, SSS’99, pages 127-149, Huntingdon, UK, February 
1999. Springer Verlag. 

8. P. Le Guernic, T. Gautier, M. Le Borgne, and C. Le Maire. Programming real-time
applications with Signal. Proceedings of the IEEE, 79(9):1321-1336, September
1991.

9. N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The synchronous data flow
programming language Lustre. Proceedings of the IEEE, 79(9):1305-1320,
September 1991.

10. N. Halbwachs, P. Raymond, and Y.-E. Proy. Verification of Linear Hybrid
Systems by means of Convex Approximations. In Proceedings of the International
Symposium on Static Analysis SAS'94, LNCS 864. Springer Verlag, September
1993.

11. D. Harel. Statecharts: A visual formalism for complex systems. Science of Com- 
puter Programming, 8:231-274, 1987. 

12. Integrated Systems Inc. SystemBuild v 5.0 User’s Guide. Santa Clara, CA, USA, 
1997. 

13. A. Kountouris and C. Wolinski. Hierarchical conditional dependency graphs for 
mutual exclusiveness identification. In 12th International Conference on VLSI 
Design, Goa, India, January 1999. 







14. G. Lafferriere, G. J. Pappas, and S. Yovine. A New Class of Decidable Hybrid 
Systems. In proceedings of Hybrid Systems: Computation and Control, LNCS 1569, 
pages 137-151. Springer Verlag, March 1999. 

15. M. Le Borgne, H. Marchand, E. Rutten, and M. Samaan. Formal verification of
signal programs: Application to a power transformer station controller. In
Proceedings of AMAST'96, LNCS 1101, pages 271-285, Munich, Germany, July 1996.
Springer-Verlag.

16. E. Marchand, E. Rutten, and F. Chaumette. From data-flow task to multi-tasking:
Applying the synchronous approach to active vision in robotics. IEEE Trans. on
Control Systems Technology, 5(2):200-216, March 1997.

17. H. Marchand, P. Bournai, M. Le Borgne, and P. Le Guernic. A design environment 
for discrete-event controllers based on the signal language. In 1998 IEEE Inter- 
national Conf. On Systems, Man, And Cybernetics, pages 770-775, San Diego, 
California, USA, October 1998. 

18. H. Marchand and M. Samaan. On the incremental design of a power transfor- 
mer station controller using controller synthesis methodology. In World Congress 
on Formal Methods (FM’99), volume 1709 of LNCS, pages 1605-1624, Toulouse, 
France, September 1999. Springer Verlag. 

19. The MathWorks, Inc. Real-Time Workshop User’s Guide, May 1997. 

20. The MathWorks, Inc. Stateflow User's Guide, May 1997.

21. The MathWorks, Inc. Target Language Compiler Reference Guide, May 1997. 

22. The MathWorks, Inc. Using Simulink, January 1997. 

23. P. Mosterman. An Overview of Hybrid Simulation Phenomena and Their Support 
by Simulation Packages. In Hybrid Systems: Computation and Control, Proceedings 
of the second international workshop, March 1999, LNCS 1569, pages 168-177. 
Springer Verlag, March 1999. 

24. S. Nadjm-Tehrani and O. Åkerlund. Combining Theorem Proving and Continuous
Models in Synchronous Design. In Proceedings of the World Congress on Formal
Methods, Volume II, LNCS 1709, pages 1384-1399. Springer Verlag, September
1999.

25. S. Nadjm-Tehrani and J.-E. Strömberg. Verification of Dynamic Properties in an
Aerospace application. Formal Methods in System Design, 14(2):135-169, March
1999.

26. A. Puri and P. Varaiya. Verification of Hybrid Systems Using Abstractions. In
Proceedings of Hybrid Systems II, LNCS 999, pages 359-369. Springer Verlag, 1994.

27. J.-E. Strömberg. A mode switching modelling philosophy. PhD thesis, Linköping
University, Linköping, 1994. Dissertation no. 353.

28. J.-E. Strömberg and S. Nadjm-Tehrani. On discrete and hybrid representation of
hybrid systems. In Proceedings of the SCS International Conference on Modeling
and Simulation (ESM'94), pages 1085-1089, Barcelona, Spain, 1994.

29. S. Tudoret. Signal-Simulink: Hybrid system co-simulation. Technical Report
cis-1999-020, Dept. of Computer and Information Science, Linköpings
University, December 1999. Currently available under Technical reports from
http://www.ida.liu.se/~eslab/publications.shtml.







A System for Object Code Validation 



A. K. Bhattacharjee¹, Gopa Sen¹, S. D. Dhodapkar¹*, K. Karunakar²,
Basant Rajan³, and R. K. Shyamasundar³

¹ Reactor Control Division, Bhabha Atomic Research Centre, Mumbai 400 085, India
{anup,gopa,sdd}@magnum.barc.ernet.in
² Independent V&V Group, Aeronautical Development Agency, Bangalore, India
³ STCS, Tata Institute of Fundamental Research, Mumbai 400 005, India
{basant,shyam}@tcs.tifr.res.in



Abstract. In several key safety-critical embedded applications, it has 
become mandatory to verify the process of translation by compilers since 
usually compilers are only certified rather than verified. In this paper, 
we shall describe a methodology and a system for the validation of trans- 
lation of a safe-subset of Ada to assembly language programs. The work 
described here is an application of Translation Validation technique to 
safety-critical programs that are developed using standard software en- 
gineering practices using safe subsets of Ada such as SPARK Ada [3]. 
Our method consists of converting the high level language (HLL) pro- 
gram and its object code to a common semantic representation such as 
Fair Transition System (FTS) [6], and then establishing that the object 
code is a refinement of the HLL program. The proof of refinement is 
performed using STeP (Stanford Temporal Prover) theorem prover. The 
proposed approach also has the additional advantage that the embedded 
system remains unaffected by compiler revisions/updates. We conclude 
with a discussion of our practical experience, effectiveness and further 
possibilities. 



1 Introduction 

In the development of software for safety critical applications, very high 
levels of confidence in the correctness of code is essential. The two steps 
in the realization of object code (cf. Fig. 1) are: 




Fig. 1. Two Main Steps from Specification to Realization 



* Corresponding Author 

Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 152-169, 2000. 
Springer-Verlag Berlin Heidelberg 2000










1. Realizing a high level program from a given requirement specification 
and architectural specification. 

2. Generating the object code using the translator (or compiler). 

In Fig. 1, dotted lines are shown between the specification and the HLL 
realization to reflect several successive refinements whereas the solid ar- 
row between the HLL and the translator shows the direct nature of the 
refinement (translation). To show that the object code indeed realizes the 
given specification, we need to establish that: 

1. high level program is an implementation of the given specification, 

2. object code derived is a correct translation of HLL program. 

The first step depends upon the level of formalism at the specification 
level. In the current scenario, rigorous specification (e.g., B, Z, VDM etc) 
and development in high level languages (e.g., Ada, MISRA C[4] etc) is 
an industrial reality. The second step depends upon the correctness of the 
translator. In the context of compiler development, the step corresponds 
to the preservation of the meaning of program by the compiler. It may 
be noted that even certified compilers will have several known bugs many 
of which could affect executable code. Thus, one has to look for solutions 
such as: 

1. Verification of Compiler 

2. Formal verification of compiled code 

3. Establishing the equivalence of the source and the object code. 

Compiler verification is an extremely difficult task, almost impossible
in practice (undecidable in general), and formal verification of compiled code is
again extremely difficult. Translation Validation is an approach proposed
in [1] that intrinsically realizes (3) in a pragmatic manner. Translation
validation is based on Refinement Mappings [7] and can be used to establish
that the object code produced by the compiler on a given pass is a
correct implementation of the input program. Refinement Mappings have
been used to prove that a lower-level specification correctly implements
a higher-level specification. Pnueli et al. [1] proposed the technique of
Translation Validation to show that a program in Signal [8] - a synchronous
language - is correctly translated into C - an asynchronous
language.
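To give a feel for what a refinement mapping check involves, the following toy sketch works on small, explicitly enumerated transition systems. It is an illustration under strong assumptions and not the method of [1] or [7]: it ignores fairness and observable variables, it allows stuttering steps on the source side, and the state names and the function `is_refinement` are hypothetical.

```python
from typing import Dict, Set, Tuple

Transition = Tuple[str, str]


def is_refinement(target_init: Set[str], target_trans: Set[Transition],
                  source_init: Set[str], source_trans: Set[Transition],
                  mapping: Dict[str, str]) -> bool:
    """Check a mapping from target states to source states.

    Every target initial state must map to a source initial state, and every
    target transition must map to a source transition or to a stuttering step
    (both end points mapped to the same source state).
    """
    if any(mapping[s] not in source_init for s in target_init):
        return False
    for s, t in target_trans:
        a, b = mapping[s], mapping[t]
        if a != b and (a, b) not in source_trans:
            return False
    return True


# Source: a single high-level step. Target: the same step split into two
# low-level steps, the first of which is invisible at the source level.
source_init = {"start"}
source_trans = {("start", "done")}
target_init = {"t0"}
good_target = {("t0", "t1"), ("t1", "t2")}
bad_target = {("t0", "t2"), ("t2", "t0")}  # contains a step with no source counterpart
mapping = {"t0": "start", "t1": "start", "t2": "done"}

ok = is_refinement(target_init, good_target, source_init, source_trans, mapping)
bad = is_refinement(target_init, bad_target, source_init, source_trans, mapping)
```

For real programs the state spaces are not enumerable in this way, which is why such proof obligations are usually discharged with a theorem prover rather than by explicit enumeration.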

In this paper, we shall propose a methodology supported with tools 
to establish equivalence of the given Ada source program with the cor- 
responding object code. The rationale is based on the fact that usually 







in safety-critical embedded applications, it would suffice to establish cor- 
rectness of compilation of a finite set of programs. In our approach, we 
basically find out whether the object code generated can be proved to be 
an implementation of the Ada source program restricted to a safe subset 
such as SPARK Ada [3]. For this, both the source and the object programs 
are brought under a common semantic framework of FTS and then it is 
shown that the transition system of the object program is a refinement 
of the transition system of the Ada source program. The contribution of 
the paper lies in the proposal of a methodology for Object Code Valida- 
tion (OCV) with tool support for validating translations from a general 
purpose HLL to assembly language. Specifically, the contributions are:

1. A method for validating the translation of a safe-subset of Ada 
(SPARK subset) to i960 assembly programs. 

2. A suite of tools for (a) Translation of the safe subset of Ada into 
FTS, (b) Translation of object program (i960 assembly) into FTS, 
that enable validation of the given Ada program through the STeP 
[5] theorem prover using interface mapping built using symbol tables 
generated by the compiler. 

The rest of the paper is organized as follows: Section 2 gives an overview of
translation validation and the related work. In section 3, an outline of the 
proposed OCV method is given. This is followed by a detailed treatment 
of refinements, proof obligations in STeP and a simple illustrative example 
of the method in section 4. In section 5, we give an example wherein the 
method detects translational errors. In section 6 implementation issues 
regarding support tools of OCV are discussed. The paper concludes with 
a summary of the experience, the status of the implementation and further 
possibilities. 

2 Translation Validation: An Overview 

For establishing the correctness of a compiler, one has to prove that 
the compiler always produces target code that correctly implements the 
source code. Owing to the intrinsic complexities of compiler verification,
an alternative referred to as Translation Validation has been explored in
[1]. In this approach, each individual translation (i.e. a run of the compiler)
is followed by a validation phase which verifies that the target code
produced on this run correctly implements the source program. Such a
possibility is particularly relevant for embedded systems where there is a 
need to execute a finite set of target programs. It must be pointed out 




A System for Object Code Validation 155 



that the validation task becomes increasingly difficult with the increase of 
sophistication and optimizations methods like scheduling of instructions 
as in RISC architectures or methods of code generation/optimization for 
super-scalar machines. In [1] , the authors demonstrated the the practica- 
bility of translation validation for a translator /compiler that translates 
the synchronous language Signal to C without any optimizations^. The 
method exploits the special features of the Signal compiler: 

1. Each program consists of an initialization followed by an infinite loop
consisting of phases such as calculating clock expressions, reading inputs,
computing outputs, writing outputs and updating previous expressions.

2. The compiler translates a program structurally. 

The question that arises is: 

Is it possible to apply the above technique to a non-synchronous
language that does not use such structural translation?

Given the difficulties of compiler verification already mentioned, it is
clear that the method cannot be extended in an unconstrained
manner. Our first attempt has been to consider, instead of synchronous
languages, general purpose programming languages such as the SPARK subset
of Ada [3]. Here, we describe a method by which the above approach can be
used for a compiler that translates the SPARK subset of Ada to i960 assembly.
The basic characteristics of the underlying language and its translator
that we have exploited are:

1. Source program satisfies safeness constraints as in SPARK Ada. 

2. The programs also pass through some of the well-established metrics 
of software engineering required for certification. 

3. The compiler is a certified industrial compiler that does not use com- 
plex optimizations; the assembler is a simple translator having almost 
a one-one correspondence between assembly and machine code. 

The above characteristics are indeed the minimum requirements imposed 
on translators used in critical embedded applications. 

2.1 Fair Transition System 

The common semantic framework, viz. FTS [6], is formally described as
$F = \langle V, \Theta, \Gamma, E \rangle$ where,

- $V = \{u_1, u_2, \ldots, u_n\} \subseteq \mathcal{V}$: A finite set of system variables consisting of
  data variables and control variables, where $\mathcal{V}$ is the vocabulary.

- $\Theta$: The initial condition characterizing the initial states.

- $\Gamma$: A finite set of transitions; each $\tau \in \Gamma$ is a function $\tau : \Sigma \to 2^{\Sigma}$ mapping
  each state $s \in \Sigma$ into a set of states $\tau(s) \subseteq \Sigma$.

- $E \subseteq V$: A set of externally observable variables.

A computation of an FTS is an infinite sequence
$\sigma = \langle s_0, s_1, s_2, \ldots \rangle$, where $s_i \in \Sigma$ for each $i \in \mathbb{N}$, such that $s_0 \models \Theta$ and
$(s_i, s_{i+1}) \models \tau$ for some $\tau \in \Gamma$.

^1 In [2], an extension of the approach to the TNI Signal compiler is explored.
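The definition can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the class name `FTS`, the helpers `make_state` and `is_computation_prefix`, and the toy two-location system are hypothetical and do not come from STeP. Fairness requirements are not modelled, and only finite prefixes of a computation are checked.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

# A state is a sorted tuple of (variable, value) pairs so that it is hashable.
State = Tuple[Tuple[str, int], ...]


def make_state(**values: int) -> State:
    """Build a state from keyword arguments, e.g. make_state(pi0=0, x=0)."""
    return tuple(sorted(values.items()))


@dataclass
class FTS:
    variables: FrozenSet[str]                                # V
    initial: Callable[[State], bool]                         # Theta
    transitions: Dict[str, Callable[[State], List[State]]]   # Gamma
    observables: FrozenSet[str]                              # E, a subset of V


def is_computation_prefix(fts: FTS, run: List[State]) -> bool:
    """Check s0 |= Theta and that every step is taken by some transition."""
    if not run or not fts.initial(run[0]):
        return False
    return all(
        any(nxt in tau(cur) for tau in fts.transitions.values())
        for cur, nxt in zip(run, run[1:])
    )


# Hypothetical toy system: T1 moves from location 0 to 1 and sets x to 1.
def t1(s: State) -> List[State]:
    return [make_state(pi0=1, x=1)] if dict(s)["pi0"] == 0 else []


def idle(s: State) -> List[State]:
    return [s]


toy = FTS(
    variables=frozenset({"pi0", "x"}),
    initial=lambda s: dict(s)["pi0"] == 0,
    transitions={"T1": t1, "idle": idle},
    observables=frozenset({"x"}),
)
good_run = [make_state(pi0=0, x=0), make_state(pi0=1, x=1), make_state(pi0=1, x=1)]
bad_run = [make_state(pi0=0, x=0), make_state(pi0=1, x=7)]
```

Here `good_run` is accepted and `bad_run` is rejected, because no transition of the toy system produces `x = 7`.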

3 An Overview of the OCV Method 

Our method for OCV consists of: 

1. Translating the given Ada program into SPL (Simple Programming 
Language used in STeP) or FTS. 

2. Translating the i960 program to SPL or FTS. 

3. Deriving the interface mapping using the symbol table generated by 
the compiler. 

4. Using STeP to show that the FTS for the object code is a refinement 
of the FTS for the source program. 




[Figure: block diagram. Recoverable labels: Representation in Common Semantic Framework; Theorem Proving Environment; Interface Mapping Between Abstract (A) and Concrete (C) System; Incorrect (error detection).]

Fig. 2. Overall Object Code Validation Scheme



The overall OCV scheme is shown diagrammatically in Figure 2. The task
of proving correctness of refinement in general is quite arduous and almost
impossible. However, as shown in the sequel, it is indeed possible under
some constraints. The feasibility of our approach relies on the following
aspects:

1. Restricting the language subset to safe subsets enforces good structu- 
ral relations on the FTS for the source and the object program since 
we also assume that there are no complex optimizing transformations. 








2. The above structure also provides support for the Interface Mapping of
variables. For instance, an actual compiler like GNU gcc has options
(e.g., -gstabs, -gstabs+) for producing debugging information, which can
be used effectively to derive correspondences
between variables in the source and the object program.

3. Use of interactive theorem provers such as STeP to establish the equivalence.

4 Formal Description of the OCV Method 

First, the HLL program and its object code are translated into FTS
using the two translators shown in Figure 2. The two fair transition systems
obtained become the input to the theorem proving tool STeP to
carry out the proof of correctness of translation. The other input required
is the Interface Mapping (cf. Fig. 2) that provides a mapping between
the variables in the HLL program and the object code. Although FTS
is the final representation used in the proof, it is also possible to first
translate the HLL program and its object code into SPL, which is accepted by
STeP as input. The translators shown in Figure 2 are designed to take
one HLL program and its object code as input and produce the corresponding
SPL programs. The representations of the HLL program and its object
code, in the form of FTS, could then be obtained using STeP. However, as
explained later, we have found that in the case of object code it is easier
to translate it directly to FTS. We illustrate the above concepts using a
simple HLL program given below:

int test(int a)
{ int i,j=1,k=2;
  if (a) i=j*i +k;
  else i=k*i + j ;
}

The FTS of the above HLL program is shown in Fig. 3 and is referred
to as the Abstract System, denoted by $A = \langle V_A, \Theta_A, \Gamma_A, E_A \rangle$, where
$V_A = \{pi0, i, j, k, l, n, a\}$, $\Theta_A = \{pi0 = 0\}$, $\Gamma_A = \{T_1, T_2, T_3, T_4, T_5\}$ and
$E_A = \{a, l, i, k, j\}$; $pi0$ denotes the program location counter. The transition system
shown in Fig. 4 is obtained from the object code of the above HLL
program. For lack of space, the object code is not shown here. In the
sequel, the FTS corresponding to the object code is always referred to as
the Concrete System and is denoted by $C = \langle V_C, \Theta_C, \Gamma_C, E_C \rangle$, where
$V_C = \{pi0, ebp\_4, ebp\_8, ebp\_12, ebp\_16, ebp8, ecx, edx, eax, esp\}$, $\Theta_C = \{pi0 = 0\}$,








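[Figure: transition diagram of the Abstract System. Recoverable labels: transition T1 carrying the initial assignments to i, j and k; edges annotated in the form Condition / Action.]

Fig. 3. Transition System for the Abstract System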



[Figure: transition diagram of the Concrete System. Recoverable labels: initial assignments ebp_4:=1, ebp_8:=1, ebp_12:=2; two branches guarded by a test on ebp8, each a chain of register moves of the form eax:=ebp_8 (or eax:=ebp_12), ecx:=ebp_4*eax, eax:=ecx, edx:=ebp_12, ecx:=eax+edx, ebp_16:=ecx, ebp_4:=ebp_16; the transition names T3, T4, T7, T9, T10, T11, T12 and T14 are legible.]

Fig. 4. Transition System for the Concrete System



$\Gamma_C = \{T_1, T_2, \ldots, T_{15}\}$ and $E_C = \{ebp\_4, ebp\_8, ebp\_12, ebp\_16, ebp8\}$. The
Concrete System variables ebp_4, ebp_8, ebp_12, ebp_16 and ebp8 correspond
to the variables in the Abstract System. The correspondence between
variables of A and C is $\{i \mapsto ebp\_4, j \mapsto ebp\_8, k \mapsto ebp\_12, l \mapsto ebp\_16, a \mapsto ebp8\}$.
Thus, $V_A \subseteq V_C$. Even though in this example there
is a direct one to one correspondence between the observables of the two
systems, in general it need not be so. For example, the HLL may support an
abstract data type like a long integer (64 bit), which when mapped to
a low level assembly program may be represented as two 32 bit integers
(d1, d2), and the mapping will be the expression $2^{32} * d2 + d1$. It can
also be observed that the numbers of states in the two transition systems
are different ($|\Sigma_C| > |\Sigma_A|$). A state in the HLL program (or its FTS) is
said to correspond to a state in the object program (as represented by its
FTS) when they agree on the values of the observable variables. Formally,
if $s_A \in \Sigma_A$ and $s_C \in \Sigma_C$ are two states of the HLL and object program, then
$s_A \mapsto s_C$ iff $\forall v_A \in E_A \cdot v_A(s_A) = v_C(s_C)$, where $v_C \in E_C \wedge v_A \mapsto v_C$.
Here $v(s)$ means the value of variable $v$ interpreted in state $s$. Thus, in
Figure 4 the shaded nodes are the states which correspond
to states {0, 2, 4, 3, 5} in the abstract system, in that order. The states of the
Concrete System that do not have a correspondence with those of the
Abstract System are unobservable and hence are local to it.
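As a rough illustration of state correspondence, the following Python sketch checks whether an abstract state and a concrete state agree on the observables under the variable mapping given above, and shows the expression mapping for a 64 bit integer held in two 32 bit words. The function names `corresponds` and `long_from_halves`, and the sample state values, are assumptions made for this example only.

```python
# Variable correspondence from the text: abstract name -> concrete name.
MAPPING = {"i": "ebp_4", "j": "ebp_8", "k": "ebp_12", "l": "ebp_16", "a": "ebp8"}


def corresponds(abstract_state, concrete_state, mapping=MAPPING):
    """True when both states agree on every mapped observable variable."""
    return all(abstract_state[v] == concrete_state[mapping[v]] for v in mapping)


def long_from_halves(d1, d2):
    """Expression mapping 2**32 * d2 + d1 for two unsigned 32 bit words."""
    return (d2 << 32) + d1


# Hypothetical sample states; eax is a local (unobservable) concrete variable.
abstract = {"i": 3, "j": 1, "k": 2, "l": 3, "a": 1}
concrete = {"ebp_4": 3, "ebp_8": 1, "ebp_12": 2, "ebp_16": 3, "ebp8": 1, "eax": 99}
```

Registers such as `eax` play no part in the comparison, which reflects the remark that concrete states without an abstract counterpart are local to the Concrete System.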

4.1 Correctness of Translation 

Let us consider an Abstract System A representing an HLL program and a
Concrete System C representing the corresponding object code. The system
A can be viewed as a specification for the implementation C. A
refinement mapping [7] is defined as a function $f : \Sigma_C \to \Sigma_A$ which
maps concrete system states to abstract system states. If the translation
is correct, C will preserve the essential features of A except for:

- The concrete system need not agree with the abstract system on the 
values of all variables. The refinement relation singles out the obser- 
vable variables whose behavior should be preserved. 

- In the course of computation, the concrete system may require data
movement in temporary locations (registers). This leads to the possibility
of losing one to one correspondence between states in the two
systems.

- The abstract system can operate in terms of high level abstract data
types, while the concrete version is restricted to only those data types
available in the particular architecture. Consequently, one should
not always expect one to one correspondence between the concrete
observable variables and the abstract ones.

From the theory of refinement mapping [1,7], we have:

- if $f : \Sigma_C \to \Sigma_A$ is an inductive refinement mapping from C to A then
C is said to be a refinement of A, i.e. $C \sqsubseteq A$.

- A refinement mapping $f : \Sigma_C \to \Sigma_A$ is called inductive iff the B-INV
rule given below is satisfied:

R1. Initiation: $s \models \Theta_C \rightarrow f(s) \models \Theta_A$, for all $s \in \Sigma_C$, and

R2. Propagation: $(s, s') \models \rho_C \rightarrow (f(s), f(s')) \models \rho_A$, for all $s, s' \in \Sigma_C$.

Given two FTSs, A and C, we have to show that C is a correct implementation
of A or, in other words, that C refines A. We assume that $E_A = V_A - \{pi0\}$.
Let $\alpha : V_A \to \mathcal{E}(V_C)$ be a substitution that replaces each abstract variable
$v \in V_A$ by an expression $\mathcal{E}_v$ over the concrete variables. Such a substitution
$\alpha$ induces a mapping between states. To show that C refines
A it is required to show that [7] R1: $\Theta_C \rightarrow \phi$, R2: $\forall \tau \in \Gamma_C \cdot \{\phi\}\tau\{\phi\}$,
R3: $\Theta_C \rightarrow \Theta_A[\alpha]$ and R4: $\phi \rightarrow \phi_A[\alpha]$, where $\phi$ is an assertion on C,
$\phi_A$ is the corresponding assertion on A and
$\alpha : V_A \to \mathcal{E}(V_C)$ is a substitution. The rules R1-R2 express the requirement
that $\phi$ is an invariant of system C and R3-R4 express the requirement
that $\alpha$ is an inductive refinement mapping. We define $\phi$ to be
an invariant defining the conditions under which observable variables are
changed by the computation.
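For finite systems, the initiation and propagation conditions of an inductive refinement mapping can be checked by plain enumeration. The sketch below is a brute-force stand-in for the deductive proof and is not how STeP works; the function `is_inductive_refinement` and the three-state toy systems are hypothetical. The abstract system is assumed to allow idling steps, so that concrete steps through local states map to abstract stuttering.

```python
def is_inductive_refinement(f, conc_states, conc_init, conc_step, abs_init, abs_step):
    """Enumerate concrete states and check initiation and propagation for f."""
    for s in conc_states:
        # Initiation: concrete initial states map to abstract initial states.
        if conc_init(s) and not abs_init(f(s)):
            return False
        # Propagation: every concrete step maps to an abstract step.
        for s_next in conc_step(s):
            if f(s_next) not in abs_step(f(s)):
                return False
    return True


# Hypothetical concrete system: locations 0 -> 1 -> 2, then idle at 2.
conc_states = [0, 1, 2]
conc_init = lambda s: s == 0
conc_step = lambda s: {min(s + 1, 2)}

# Hypothetical abstract system: locations 0 -> 1, with idling allowed.
abs_init = lambda a: a == 0
abs_step = lambda a: {a, min(a + 1, 1)}

# Concrete location 1 is a local state that maps back to abstract location 0.
f = {0: 0, 1: 0, 2: 1}.get
g = {0: 1, 1: 1, 2: 1}.get  # violates initiation
```

With these definitions `f` passes both checks, while `g` fails initiation because the concrete initial state is mapped to a non-initial abstract state.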

4.2 Proof of Validation Using STeP 

Let $u_A$ be an observable variable in A and $u_C$ be the corresponding variable
in C. Consider a state $s_i$ in the abstract system where $u_A$ is defined.
Let $I_{u_A}(s_0 \leadsto s_i)$ be the path condition (the conjunction of all predicates on
the path $s_0 \leadsto s_i$) and $cond(u_A)$ be the disjunction of all such path
conditions, because there can be more than one path from $s_0$ to $s_i$. Then
an invariant can be defined by $\phi : at\_s_i \rightarrow cond(u_A) \wedge u_A = \ldots$. Thus,
taking the FTS in Fig. 3, we can have $\phi : pi0 = 2 \rightarrow a > 0 \wedge l = i * j + k$
and $\phi : pi0 = 3 \rightarrow \neg(a > 0) \wedge l = i * k + j$. Such invariants can also be
defined for the concrete system. The fact that these invariants are indeed
true in the respective systems can be verified by the B-INV rule. If for any
transition $\tau \in \Gamma$, $\{\phi\}\tau\{\phi\}$ is not established, one can generate a weakest
precondition (WPC) [9] which should hold before the transition is
taken so that $\phi$ remains true after the transition. The WPC itself can
be checked by applying B-INV. These steps can be applied repeatedly until the
invariant is proved or disproved.
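The interplay between B-INV and strengthening can be illustrated on a small model of the abstract system. The sketch below is an assumption-laden reading, not the authors' FTS: the transition structure (0 to 1 for initialization, 1 to 2 or 3 on the test of a, then 2 to 4 and 3 to 5 for i := l) is inferred from the C fragment and from properties PA1-PA4, the initial values i = 1, j = 1, k = 2 are assumed, and the variables are enumerated over a tiny range instead of being reasoned about symbolically.

```python
from itertools import product

VALS = range(-1, 3)  # small value range used only for enumeration


def step(s):
    """Successors of state (pi0, a, i, j, k, l) in the assumed abstract system."""
    pi0, a, i, j, k, l = s
    if pi0 == 0:
        return [(1, a, 1, 1, 2, l)]                 # assumed initial assignments
    if pi0 == 1:
        if a > 0:
            return [(2, a, i, j, k, i * j + k)]     # then-branch computes l
        return [(3, a, i, j, k, i * k + j)]         # else-branch computes l
    if pi0 in (2, 3):
        return [(pi0 + 2, a, l, j, k, l)]           # i := l
    return []


def pa1(s):
    pi0, a, i, j, k, l = s
    return pi0 != 2 or (a > 0 and l == i * j + k)


def pa2(s):
    pi0, a, i, j, k, l = s
    return pi0 != 4 or (a > 0 and i == l)


def b_inv(phi):
    """Initiation (pi0 = 0 implies phi) and consecution (phi is preserved)."""
    for s in product(range(6), VALS, VALS, VALS, VALS, VALS):
        if s[0] == 0 and not phi(s):
            return False
        if phi(s) and any(not phi(t) for t in step(s)):
            return False
    return True
```

In this model `pa1` is inductive on its own, `pa2` is not (nothing is known about a at location 2), and the conjunction of the two is inductive. This mirrors the role of the weakest precondition: the failed check at location 2 suggests the strengthening that PA1 supplies.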

We define the substitution function $\alpha : V_A \to \mathcal{E}(V_C)$. This function
defines the mapping from an abstract variable to its counterpart
in the concrete system. In STeP this mapping is expressed by Simplify
or Rewrite rules. These simplification rules are declared with the
SIMPLIFY and REWRITE keywords. The SIMPLIFY rules are automatically
and exhaustively applied when the STeP simplifier is invoked.
REWRITE rules are applied interactively. We also define a mapping of
states in the two systems where the values of the observables in the two
systems are the same. Now if $\phi(u_C)$ is an invariant in the concrete system for
a concrete variable $u_C$ and $\phi(u_A)$ is an invariant in the abstract system,
then, if the translation is correct, $\phi(u_C) \rightarrow \phi(\alpha(u_A))$ must be true by rule
R4. This should be true for all observable variables. The technique by
which this is carried out is the MON-I [6] rule (modus ponens), which says
that if $p$ is true and $p \rightarrow q$ then $q$ is true.







For the system shown in Fig. 3, we prove the following invariants
(PROPERTY PA1-PA4) by using the B-INV and WPC rules. For the systems
shown in Fig. 3 and Fig. 4, the script showing the proof requirement for
correct implementation is given below:

(* These are the invariants of the Abstract System and
   are shown in the syntax accepted by STeP. *)
PROPERTY PA1: [] (pi0 = 2 --> a > 0 /\ l = i*j + k)
PROPERTY PA2: [] (pi0 = 4 --> a > 0 /\ i = l)
PROPERTY PA3: [] (pi0 = 3 --> ~(a > 0) /\ l = i*k + j)
PROPERTY PA4: [] (pi0 = 5 --> ~(a > 0) /\ i = l)

(* Properties PC1-PC4 are invariants of the Concrete System *)
PROPERTY PC1: [] (pi0 = 7 --> ebp8 > 0 /\ (ebp_16 = ebp_8*ebp_4 + ebp_12))
PROPERTY PC2: [] (pi0 = 8 --> ebp8 > 0 /\ ebp_4 = ebp_16)
PROPERTY PC3: [] (pi0 = 14 --> ~(ebp8 > 0) /\ (ebp_16 = ebp_12*ebp_4 + ebp_8))
PROPERTY PC4: [] (pi0 = 15 --> ~(ebp8 > 0) /\ ebp_4 = ebp_16)

(* Interface mapping between abstract variables i,j,k,l,a and the
   concrete variables -4(ebp), -8(ebp), -12(ebp), -16(ebp) and 8(ebp).
   Here -1 means the variable is unobservable *)
value i: int*int --> int
value j: int*int --> int
value k: int*int --> int
value l: int*int --> int
value a: int*int --> int

SIMPLIFY S1: i(pi0, ebp_4) --> if pi0 >= 0 then ebp_4 else -1
SIMPLIFY S2: j(pi0, ebp_8) --> if pi0 >= 0 then ebp_8 else -1
SIMPLIFY S3: k(pi0, ebp_12) --> if pi0 >= 0 then ebp_12 else -1
SIMPLIFY S4: l(pi0, ebp_16) --> if (pi0=7 \/ pi0=8 \/ pi0=14
                                    \/ pi0=15) then ebp_16 else -1
SIMPLIFY S5: a(pi0, ebp8) --> if pi0 >= 0 then ebp8 else -1

(* Axioms A1-A4 axiomatize the correspondence between observable states *)
control pca: [0..5] (* control variable *)

AXIOM A1: pi0=7 <==> pca=2
AXIOM A2: pi0=14 <==> pca=3
AXIOM A3: pi0=8 <==> pca=4
AXIOM A4: pi0=15 <==> pca=5

(* Properties P1-P4 are invariants of the Abstract System (written with
   mapping). Proof Obligation for correct refinement *)
PROPERTY P1: [] (pca = 2 --> a(pi0,ebp8) > 0 /\ l(pi0,ebp_16) =
                 j(pi0,ebp_8)*i(pi0,ebp_4) + k(pi0,ebp_12))
PROPERTY P2: [] (pca = 4 --> a(pi0,ebp8) > 0 /\ i(pi0,ebp_4) =
                 l(pi0,ebp_16))
PROPERTY P3: [] (pca = 3 --> ~(a(pi0,ebp8) > 0) /\ l(pi0,ebp_16) =
                 k(pi0,ebp_12)*i(pi0,ebp_4) + j(pi0,ebp_8))
PROPERTY P4: [] (pca = 5 --> ~(a(pi0,ebp8) > 0) /\ i(pi0,ebp_4) =
                 l(pi0,ebp_16))

Here properties PC1-PC4 are invariants of the concrete system. These
are proved using B-INV and WPC, as was done in proving properties
PA1-PA4 for the abstract system. Properties P1-P4 are the properties of
the abstract system (PA1-PA4) but written with the substitution function.
These properties are in turn proved using the properties PC1-PC4 and the
MON-I rule; note that premise R3 can be trivially proved.
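The effect of the SIMPLIFY rules and of axiom A1 can be mimicked on a single concrete state. In this sketch the functions `observable`, `alpha` and `p1`, the table `PCA` and the sample state are all hypothetical names and values; the sketch evaluates obligation P1 on one state and does not prove it.

```python
def observable(visible, value):
    """Mirror of the SIMPLIFY rules: -1 marks an unobservable variable."""
    return value if visible else -1


def alpha(c):
    """Abstract variables expressed over a concrete state c (rules S1-S5)."""
    pi0 = c["pi0"]
    return {
        "i": observable(pi0 >= 0, c["ebp_4"]),
        "j": observable(pi0 >= 0, c["ebp_8"]),
        "k": observable(pi0 >= 0, c["ebp_12"]),
        "l": observable(pi0 in (7, 8, 14, 15), c["ebp_16"]),
        "a": observable(pi0 >= 0, c["ebp8"]),
    }


# Axioms A1-A4: concrete location -> abstract control value pca.
PCA = {7: 2, 14: 3, 8: 4, 15: 5}


def p1(c):
    """Obligation P1: pca = 2 implies a > 0 and l = j*i + k, after substitution."""
    if PCA.get(c["pi0"]) != 2:
        return True
    v = alpha(c)
    return v["a"] > 0 and v["l"] == v["j"] * v["i"] + v["k"]


state = {"pi0": 7, "ebp_4": 1, "ebp_8": 1, "ebp_12": 2, "ebp_16": 3, "ebp8": 1}
```

A state with a wrong value in `ebp_16` at location 7 would falsify `p1`, which is the kind of discrepancy the proof obligations are meant to expose.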







5 Illustrative Example with Translation Error 

The above technique was tested on an experimental basis on a number of
test programs written in the C language. The translations of the C programs
were deliberately seeded with errors to test the efficacy of the method.
All seeded errors could be detected by carrying out the required proofs.
Later the method was tested on Ada examples that had real translation
errors. One such example is given below in the form of SPL representations
of the Ada source and its object code. It was found that the version of the Ada
compiler used generated assembly code which executes a wrong path if
the value of the case predicate is more than the last allowed value. This
was detected during the proof.

SPL Representation of Ada Source Program (Abstract System) 



macro LAST: int where LAST=1024
(*
macro C1: [0..999] where C1=10
macro C2: [0..999] where C2=15
macro C3: [0..999] where C3=20
macro const: int where const=1
*)
in param: int
local v1,pav1: [-LAST..LAST]
local v2,v3: [0..999]
local C1: [0..999] where C1=10
local C2: [0..999] where C2=15
local C3: [0..999] where C3=20
local const: int where const=1
case4:: [
  v1:=param;
  v2:=v2*8;
  l0: if v2 >= 1 /\ v2 <= 99 then
        [ v3:=C1;
          l1: skip ]
      else
        [ if v2 >= 100 /\ v2 <= 199 \/ v2=201 then
            [ v3:=C2+C1;
              l2: skip ]
          else
            [ if v2=0 then
                [ v3:=const;
                  l3: skip ]
              else
                [ if v2=200 then
                    [ v3:=v2 div 4;
                      l4: skip ]
                  else
                    [ if v2=202 then
                        [ v1:=pav1;
                          l5: skip ]
                      else
                        [ if v2>=203 /\ v2 <= 999 then
                            [ v3:=v2+const;
                              l6: skip ]]]]]];
  pav1:=v1;
  l7: skip ]



In the abstract system $A = \langle V, \Theta, \Gamma, E \rangle$, V = {pi0, param, v1, v2,
v3, C1, C2, C3, pav1}, $\Theta$ = (pi0 = 0), $\Gamma$ = the set of transitions and E =
{v1, v2, v3, C1, C2, C3}.

The following are the invariants of the abstract system as included in the 
specification file (SPEC file) for the STeP. 

PROPERTY P1: l1 ==> 1<=v2 /\ v2 <= 99 /\ v3=C1
PROPERTY P2: l2 ==> ((100<=v2 /\ v2<= 199) \/ v2=201) /\ v3=(C2+C1)
PROPERTY P3: l3 ==> v2=0 /\ v3=const
PROPERTY P4: l4 ==> v2=200 /\ v3=(v2 div 4)
PROPERTY P5: l5 ==> v2=202 /\ v1=pav1
PROPERTY P6: l6 ==> 203<=v2 /\ v2 <= 999 /\ v3=(v2+const)

These invariants are proved using the rule repeat (B-INV; Simplify;
Undo; WPC).

SPL Representation of i960 Object Program(Concrete System) 

in g0: int
local r1,r2,r3,r4,r5,r6,r7,g1,g2,g3,g4,g5,g6,g7,v1,v2,v3,pav1: int
local temp: int
prog::
[ r5:=g0; temp:=r4 * 8; r4:=temp; r3:=r4;
  l1: if r4>0 then
    [ g3:=99;
      l2: if r3 > 99 then
        [ g1:=199;
          l3: if r3 > g1 then
            [ g6:=200;
              if r3 != g6 then
                [ g5:=201;
                  l4: skip;
                  l5: if r3 = g5 then
                    [ g6:=25; v3:=g6;
                      l6: skip ]
                  else
                    [ g3:=202;
                      l7: if r3 != g3 then
                        [ g6:=203;
                          l8: if g6 >= r3 then
                            [ g5:=v1; g4:=r4+g5; v3:=g4;
                              l9: skip ]
                          else
                            [ pav1:=r5;
                              l10: skip ]]
                      else
                        [ g1:=pav1; v1:=g1;
                          l11: skip ]]]
              else
                [ g6:=r4 div 4; v3:=g6;
                  l12: skip ]]
          else
            [ g7:=100;
              l13: if g7 <= r3 then
                [ g6:=25; v3:=g6;
                  l14: skip ]]]
      else
        [ if 1 <= r3 then
            [ g7:=10; v3:=g7;
              l15: skip ]]]
  else
    [ g4:=12; v3:=g4;
      l16: skip ]]



In the above concrete system $C = \langle V, \Theta, \Gamma, E \rangle$, V = {pi0, r3, r4,
v1, C1, C2, C3, pav1}, $\Theta$ = {pi0 = 0}, $\Gamma$ = the set of transitions and E =
{r3, v1, v2, v3, C1, C2, C3, pav1}. The following are the invariants of the
system as included in the specification file (SPEC file) for STeP.
These are also proved using the same rules as in the case of the abstract system
invariants.



macro C1: int where C1=10
macro C2: int where C2=15
macro C3: int where C3=20
macro const: int where const=12
control pca: [0..8]
AXIOM A1: [] Forall i:int.(i=r4 --> i=r3)

PROPERTY P1: l15 ==> 1<=r3 /\ r3 <= 99 /\ v3=10
PROPERTY P2: l14 ==> (100<=r3 /\ r3<= 199) /\ v3=25
PROPERTY P3: l6 ==> r3=201 /\ v3=25
PROPERTY P4: l16 ==> r3<=0 /\ v3=12
PROPERTY P5: l12 ==> r3=200 /\ v3=(r4 div 4)
PROPERTY P6: l11 ==> r3=202 /\ v1=pav1
PROPERTY P7: l9 ==> 203<=r3 /\ v3=(r4+v1)



The interface mapping between abstract system variables and the con- 
crete system variables and the proof obligations are shown below. 

local V1,V2,V3,PAV1: int

(* Interface Mapping between Abstract System variables
   and Concrete System variables *)
value fv1: int*int*int --> int
value fv2: int*int*int --> int
value fV2: int*int*int --> int
value fv3: int*int*int --> int
value fpav1: int*int*int --> int

SIMPLIFY S1: fv1(pi0,V1,v1) --> if pi0 >= 0 then v1 else -1
SIMPLIFY S2: fv2(pi0,V2,r3) --> if pi0 >= 0 then r3 else -1
SIMPLIFY S3: fv3(pi0,V3,v3) --> if pi0 >= 0 then v3 else -1
SIMPLIFY S4: fpav1(pi0,PAV1,pav1) --> if pi0 > 0 then pav1 else -1

(* State Correspondence between Abstract States and Concrete States *)
AXIOM M1: [] (pca=1 <--> l15)
AXIOM M2: [] (pca=2 <--> l14)
AXIOM M3: [] (pca=3 <--> l6)
AXIOM M4: [] (pca=4 <--> l16)
AXIOM M5: [] (pca=5 <--> l12)
AXIOM M6: [] (pca=6 <--> l11)
AXIOM M7: [] (pca=7 <--> l9)

(* Proof Obligation for correct translation *)
PROPERTY P1: pca=1 ==> 1 <= fv2(pi0,V2,r3) /\ fv2(pi0,V2,r3) <= 99 /\
             fv3(pi0,V3,v3)=10
PROPERTY P2: pca=2 ==> (100<=fv2(pi0,V2,r3) /\ fv2(pi0,V2,r3)<= 199) /\
             fv3(pi0,V3,v3)=(C2+C1)
PROPERTY P7: pca=3 ==> fv2(pi0,V2,r3)=201 /\ fv3(pi0,V3,v3)=(C2+C1)
PROPERTY P3: pca=4 ==> fv2(pi0,V2,r3)=0 /\ fv3(pi0,V3,v3)=const
PROPERTY P4: pca=12 ==> fv2(pi0,V2,r3)=200 /\ fv3(pi0,V3,v3)=
             (fv2(pi0,V2,r3) div 4)
PROPERTY P5: pca=6 ==> fv2(pi0,V2,r3)=202 /\ fv1(pi0,V1,v1)=
             fpav1(pi0,PAV1,pav1)
PROPERTY P6: pca=7 ==> 203<= fv2(pi0,V2,r3) /\ fv2(pi0,V2,r3) <= 999 /\
             fv3(pi0,V3,v3)=(fv2(pi0,V2,r3)+ fv1(pi0,V1,v1))



It is found that properties P3 and P6 could not be verified. P3 fails
because in the abstract system the state where v3=const has the path
condition $l3 \rightarrow v2 = 0 \wedge v3 = const$, whereas in the
concrete system the corresponding state where v3=const has the path
condition v2 <= 0. P6 fails because the upper bound of v2,
which is 999, is missing in the concrete system.
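The two failures can be reproduced informally by searching for values on which the concrete path condition holds but the abstract one does not. The sketch below is a bounded search, not a proof; the dictionaries `abstract_guard` and `concrete_guard` and the function `counterexamples` are hypothetical, and the guards are transcribed from the invariants quoted above. The search deliberately ignores the declared range [0..999] of v2 so that out-of-range values can show up as witnesses.

```python
# Path conditions on v2 (abstract) and r3 (concrete) for the two failing properties.
abstract_guard = {
    "P3": lambda v2: v2 == 0,
    "P6": lambda v2: 203 <= v2 <= 999,
}
concrete_guard = {
    "P3": lambda r3: r3 <= 0,
    "P6": lambda r3: 203 <= r3,
}


def counterexamples(name, lo=-5, hi=1005):
    """Values where the concrete guard holds but the abstract guard does not."""
    return [
        x
        for x in range(lo, hi + 1)
        if concrete_guard[name](x) and not abstract_guard[name](x)
    ]
```

Under these assumptions the witnesses for P3 are the negative values, and the witnesses for P6 are the values above 999, which matches the explanation of the wrong path taken when the case predicate exceeds the last allowed value.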



6 System for OCV: Implementation Features 

The main tasks of the implementation lie in generating the FTS for
the source and object code, extracting the interface mapping, and the
algorithm for proof. Interface mapping information is extracted
along the lines discussed in Section 3, so that the mapping is obtained
semi-automatically. The algorithm for proof of validity follows the lines of the
STeP theorem prover [5]. The translators shown in Fig. 2 for the Ada subset
and the corresponding object code have been implemented. The translator
for Ada produces SPL output while that for object code produces a Fair
Transition System (FTS), for reasons explained in Section 6.2. The features
of generating SPL/FTS and the underlying modelling are discussed
in the following.

6.1 Generating SPL/FTS for the Ada Source 

Some of the features of the translation scheme are discussed below: 

- Each Ada function/procedure is translated into a SPL procedure. 

- Ada supports many kinds of data structures, such as records,
aggregates and enumerated types, which do not have corresponding types
in SPL. Hence, the translators implemented for Ada handle only the
basic data types which have corresponding types in SPL.

- Ada supports syntactic constructs such as case, do ... while
etc. which do not have corresponding statements in SPL. This requires
that the translator use the concept of abstract interpretation to map a
given construct into a set of equivalent statements. Thus, for example,
a case statement will be translated as an if ... then ... else if ... in
SPL.

- The function invocations in Ada and object code are translated to 
corresponding function invocations in SPL or FTS. These functions 
are required not to have any side-effects (STeP assumption). 

- Procedures are modeled as multiple functional assignments to out 
variables. 
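The mapping of a case statement to a nested conditional, mentioned in the list above, can be sketched as a small text generator. This is a hypothetical illustration, not the translator described in the paper: the function `case_to_if`, its `default` argument and the final `skip` branch are assumptions, and guards and bodies are passed in as already-formed SPL fragments.

```python
def case_to_if(arms, default="skip"):
    """Turn [(guard, body), ...] into a nested SPL-style if/then/else string."""
    if not arms:
        return default
    guard, body = arms[0]
    rest = case_to_if(arms[1:], default)
    return f"if {guard} then [ {body} ] else [ {rest} ]"


# Two arms taken from the example of Section 5.
arms = [("v2=0", "v3:=const"), ("v2=200", "v3:=v2 div 4")]
spl_text = case_to_if(arms)
```

Each arm becomes one level of nesting, so the depth of the generated conditional equals the number of case alternatives, as in the SPL listing of Section 5.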

The in and out variables and their mutual dependencies are required to
be explicitly annotated (as SPARK annotations) in the programs input to
the translators, so that the functional assignment relationships between
the in and out variables can be inferred. Some illustrative examples are
shown in Table 1. Let us consider a procedure that multiplies two matrices
and returns the result in a third matrix. In SPL, the data type Matrix
is to be specified as a user defined type with corresponding axioms (not
shown). The annotated Ada declaration and the corresponding SPL code
are shown in Table 1.



Table 1. Ada-SPL mapping

Annotated Ada Declaration                 | SPL Translation
------------------------------------------+----------------------
Multiply(X,Y: in Matrix, Z: out Matrix)   | in X,Y: Matrix
--# derives Z from X,Y                    | out Z: Matrix
                                          | Z:=Multiply_Z(X,Y)
------------------------------------------+----------------------
Exchange(X,Y: in out float)               | in X,Y: rat
--# derives X from Y                      | out _mX,_mY: rat
--#         Y from X;                     | _mX:=Exchange_X(Y)
                                          | X:=_mX
                                          | _mY:=Exchange_Y(X)
                                          | Y:=_mY



Consider a procedure Exchange for exchanging two variables, as shown
in Table 1, which is called from a procedure being analyzed. The convention
for naming the functions in SPL that model the procedure is
<procedure name>_<Ada variable name which is exported>(list of variables
modifying the exported variable). The assumptions made in translating functions
and procedures can be easily ensured by using SPARK tools [3] for Ada.
Therefore, the analysis support provided by the likes of the SPARK environment
is essential before the validation of object code is undertaken.
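The idea of modelling a procedure as one side-effect-free function per exported variable can be sketched as follows. The Python names `exchange_x`, `exchange_y` and `call_exchange` follow the naming convention described above but are otherwise hypothetical. One assumption is made explicit: both temporaries are computed from the entry values of X and Y before either variable is written back, which is one way of reading the derives annotation.

```python
def exchange_x(y):
    """Models 'derives X from Y': the exported X depends only on the entry Y."""
    return y


def exchange_y(x):
    """Models 'derives Y from X': the exported Y depends only on the entry X."""
    return x


def call_exchange(x, y):
    """A call to Exchange as two functional assignments on the entry values."""
    m_x = exchange_x(y)  # temporary for the new X
    m_y = exchange_y(x)  # temporary for the new Y, still using the entry X
    return m_x, m_y
```

Because the functions have no side effects, the two assignments can be reasoned about independently, which is what the STeP assumption on function invocations requires.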

6.2 Generating FTS for the Object Code 

The object code is translated directly into FTS for reasons explained below.
Iterative statements in the HLL program may be translated
to object code using conditional and unconditional branch statements,
because the i960 processor has only binary branch statements and
no loop statements. Since the predicates in a loop statement
in the HLL (Ada) may be complex, i.e. contain conjunctions and disjunctions
of variables, it becomes very difficult to reconstruct a loop
construct in SPL from the form in which it exists in object code. This
reverse engineering can only be done through extensive flow graph analysis.
However, since the Fair Transition System syntax supports goto, it is
straightforward to translate such constructs into Fair Transition Systems
rather than into SPL. Hence the translator implemented for object code
produces FTS directly instead of an SPL program.

In the implementation of the translator from i960 assembly to FTS, the
main task lies in modelling the i960 assembly instructions; the modelling
of some illustrative instructions is discussed in the following.
Since many of the instructions in the instruction set have implicit
operations, SIMPLIFY/REWRITE rules are required to model the
effect of the instruction. Some illustrative instructions are modelled
as shown in Table 2. Take for example the cmpi src1, src2
instruction, which compares two integers and sets the condition flag cc to
4, 2 or 1 depending on whether src1 < src2, src1 = src2 or src1
> src2. This is modelled as an assignment cc:=cmp(src1,src2).

Table 2. i960 assembly-FTS mapping

Instruction       | SPL/FTS Statement         | Declaration / Rule
------------------+---------------------------+------------------------------------------
cmpi src1,src2    | cc:=cmp(src1,src2)        | value cmp: int * int --> int
                  |                           | SIMPLIFY cmp(src1, src2) -->
                  |                           |   if (src1 < src2) then 4 else
                  |                           |   if (src1 = src2) then 2 else
                  |                           |   if (src1 > src2) then 1 else 0
------------------+---------------------------+------------------------------------------
shli len,src,dst  | dst:=src*power(2,len)     | value power: int * int --> int
shri len,src,dst  | dst:=src div power(2,len) | REWRITE Forall m,n: int . power(m, n) -->
                  |                           |   if (n = 0) then 1 else
                  |                           |   if (n = 1) then m else
                  |                           |   m * power(m, n - 1)

Let us consider the example of the arithmetic shifts shli and shri shown in Table 2.
A multiplication or division by some power of two is usually implemented
by an arithmetic shift left or right respectively when translated by the
compiler.
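The modelling functions of Table 2 are simple enough to execute directly. The sketch below transcribes them into Python; the function names `cmp`, `power`, `shli` and `shri` follow the table, while the use of floor division for `div` is an assumption that only matters for negative operands.

```python
def cmp(src1, src2):
    """Condition code as in Table 2: 4 if less, 2 if equal, 1 if greater."""
    if src1 < src2:
        return 4
    if src1 == src2:
        return 2
    return 1  # src1 > src2; the table's final 'else 0' is unreachable for integers


def power(m, n):
    """Recursive definition from the REWRITE rule (n is a non-negative integer)."""
    if n == 0:
        return 1
    if n == 1:
        return m
    return m * power(m, n - 1)


def shli(length, src):
    """Shift left modelled as multiplication by 2**length."""
    return src * power(2, length)


def shri(length, src):
    """Shift right modelled as division by 2**length (floor division assumed)."""
    return src // power(2, length)
```

For instance, shifting 5 left by 3 places gives 40, and shifting 40 right by 2 places gives 10, which is the arithmetic reading the translator relies on.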

The i960 processor supports movement of data between the floating point registers
fp0-fp3 and other registers such as r0-r15 and g0-g15. Since in our implementation
the register sets r0-r15 and g0-g15 are declared as integer types, operations
involving such movement require the AXIOMs $\forall m : rat \cdot Real(Int(m)) = m$
and $\forall m : int \cdot Int(Real(m)) = m$.

7 Conclusion and Future Work 

The OCV method was first tested successfully on test programs by carrying
out manual translations to SPL or FTS. The implementation of the translators
from the Ada subset to SPL and from object code to FTS has been completed
and trial proofs have been carried out. The translations are limited to data
types supported in SPL. Using the system in its current form, we have
been able to validate several Ada programs that are in actual use and
had translation errors.

The object code validation is a task requiring special skills even in 
the presence of mechanized theorem provers. The implementation of the 
translators to SPL/FTS is a major step in reducing the total effort in- 
volved in object code validation. Human interaction is still required in 
constructing the interface mappings and in carrying out the proofs. 

The technique generally works well if there is a structural correspondence
between the flow graphs of the two programs. The common semantic
representation should be capable of handling the different types of data
structures normally used in HLLs like Ada. Each verifiable unit, which
is a function or procedure in our case, should be small enough that
state correspondence can be established and the interface mapping
constructed easily. This is not a problem if the software is well modularized
following good software engineering practices, where each module has a
small cyclomatic number. This is generally a requirement for software to
be used in safety-critical systems.

The process of validation is quite arduous. However, even with our
preliminary experience we find it to be very useful for the validator.
On-the-fly validation [10] aids in generating correct code, as a large fraction of
target-dependent errors in compilers can be detected. Program annotations
and assertions aid in the proof. If such assertions are carried into the
object code, they will aid in handling optimizations, and also in certifying
compilers that use notions such as proof-carrying code.

References 

1. Pnueli A., Siegel M., Singerman E.: Translation Validation. Proc. 4th TACAS,
LNCS 1384, pp. 151-166, Springer-Verlag, 1998.

2. Pnueli A., Siegel M., Shtrichman O.: Translation Validation for Synchronous
Languages. Proc. 25th ICALP, LNCS 1443, pp. 235-246, Springer-Verlag, 1998.

3. Barnes J.: High Integrity Ada: The SPARK Approach. Addison-Wesley, 1997.

4. Motor Industry Software Reliability Association (MISRA) of U.K.: Guidelines
for the Use of the C Language in Vehicle Based Software. MIRA, 1998.

5. Manna Z. et al.: STeP: The Stanford Temporal Prover, version 1.2 Educational
Release, Users Manual. CS Dept., Stanford Univ., 1996.

6. Manna Z., Pnueli A.: Temporal Verification of Reactive Systems. Springer-Verlag,
1995.

7. Abadi M., Lamport L.: The existence of refinement mappings. Theoretical Computer
Science, 82, pp. , Elsevier, 1991.

8. Benveniste A., Le Guernic P., Jacquemot C.: Synchronous Programming with events
and relations: the SIGNAL language and its semantics. SCP, 16, pp. , 1991.

9. Dijkstra E.W.: A Discipline of Programming. Prentice Hall, 1976.

10. Necula G.C.: Compiling With Proofs. Ph.D. Thesis, CMU, 1998.




Real-Time Program Refinement 
Using Auxiliary Variables 



Ian Hayes 

School of Computer Science and Electrical Engineering, 
The University of Queensland, Brisbane, 4072, Australia. 
ianh@csee.uq.edu.au



Abstract. Real-time program development can be split into a
machine-independent phase, that derives a machine-independent real-time
program from a specification, and a machine-dependent phase, that checks
that the compiled program will meet its deadlines when executed on the 
target machine. 

In this paper we extend a machine-independent real-time programming 
language with auxiliary variables. These are introduced to facilitate both 
reasoning about the correctness of real-time programs and the expression 
of timing deadlines, and hence the calculation of timing constraints on 
paths through a program. The auxiliary variable concept is extended to 
auxiliary parameters to procedures. 



1 Introduction 

Our overall goal is to provide a method for the formal development of real-time
programs. One problem with real-time programming is that the timing
characteristics of a program are not known until it is compiled for a particular machine,
whereas we would prefer a machine-independent program development method.
The approach we have taken is to partition program development into two
phases: a machine-independent phase, in which a machine-independent program
is derived from a specification; and a machine-dependent phase, in which the
program is compiled and is checked to ensure that all deadlines within the program
are met. The approach is facilitated by the use of a machine-independent
real-time programming language, that extends a standard real-time programming
language with constructs to allow the expression of deadlines. The crucial
extension is a deadline command [6, 1], of the form ‘deadline D’, that on execution
takes no time and guarantees to complete by absolute time D. In isolation such
a command cannot be implemented, but if it can be shown that all execution
paths leading to the deadline reach it before time D, then it can be removed.
This process can itself be split into two phases: timing constraint analysis, which
for each non-dead path leading to a deadline determines the timing constraint
that guarantees that the deadline will be met [2]; and worst-case execution-time
analysis, which checks that the worst-case execution time of the code on each
path meets its timing constraint [11].



M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 170-184, 2000. 
© Springer- Verlag Berlin Heidelberg 2000 







In addition to the deadline command, the machine-independent programming 
language may also contain logical constants and assumptions [12, 7]. These al- 
low assumptions about the program state, including timing assumptions, to be 
expressed within the program and hence facilitate timing constraint analysis. 

In this paper we add auxiliary variables to our machine-independent pro- 
gramming language. Auxiliary variables may only be used within specifications, 
assumptions, deadlines and assignments to auxiliary variables. Hence they do 
not have to be implemented in the compiled code. During the refinement pro- 
cess they can be introduced to simplify the development process and to allow 
the expression of timing deadlines. Auxiliary variables can be of any type, but of 
particular significance are auxiliary variables of type Time. These can be used 
to refer to the time of significant events in the environment of the program or 
to the time at which a point in the program is reached. 

We also allow auxiliary parameters to procedures. These allow auxiliary in- 
formation to be passed across a procedure’s interface. The auxiliary parameters 
are used to facilitate the specification of timing deadlines in one component 
with respect to events in another. There is no need to actually pass auxiliary 
parameters in the compiled code. 

Related work. Hooman and Van Roosmalen [10] have developed a platform- 
independent approach to real-time software development similar to ours. Their 
approach makes use of timing annotations that are associated with commands. 
The annotations allow the capture in auxiliary timing variables of the time 
of occurrence of significant events that occur with the associated command, 
and the expression of timing deadlines on the command relative to such timing 
variables. They give an example of a program that reads an input value x from
$d_1$, calculates y as some function of x and outputs y to $d_2$.

$$\mathit{in}(d_1, x)[m?];\ y := f(x);\ \mathit{out}(d_2, y)[< m + U]$$

The constructs in square brackets are timing annotations [10, Sect. 2]. On the
input the annotation ‘m?’ indicates that the time at which the input occurs
should be assigned to timing variable m, and on the output the annotation
‘< m + U’ requires the output to take effect before m + U, i.e., within U time units
of the input time. Hooman and Van Roosmalen keep timing annotations separate 
from the rest of the program. They recognise that this syntactic restriction is 
not necessary but advance the following arguments [10, Sect. 2]. 

1. By not introducing timing variables in the program domain it is possible 
to first construct a functionally correct program, and then consider timing 
requirements. 

2. By forbidding the use of program variables in the time domain, it becomes 
syntactically impossible to introduce data dependencies in the timing re- 
quirements. Such data dependencies usually complicate correctness proofs 
considerably. 

With regard to the first point, our experience indicates that the specification of 
a real-time program often combines both timing and functional requirements in 







an intertwined fashion. Part of the development process is separating out the 
timing requirements. 

With regard to the second point, we agree that data dependencies can com- 
plicate correctness proofs, but for some applications they are unavoidable, and 
hence we would like our methods to allow them. For example, a real-time pro- 
gram handling a low-level communications protocol may read a value from an 
input channel that indicates the length of the rest of the message, and the time 
constraint on reading the rest of the message is dependent on its length. 

Our work builds on all the above work to develop an approach that is more 
general. As in our earlier work timing deadlines are considered commands (rather 
than annotations) and given a semantics as for any other command. The gener- 
alisation introduced in this paper allows timing events, as well as other auxiliary 
information, to be captured and used in commands and as auxiliary parame- 
ters to procedures. The generalisations allow for a more flexible approach to 
specifying and reasoning about timing constraints. 

Sect. 2 introduces the machine-independent, wide-spectrum language used
for specification and refinement to code. Sect. 3 presents an example refinement
that makes use of auxiliary variables and auxiliary procedure parameters, and 
Sect. 4 discusses timing constraint analysis. 



2 Language and semantics 

We model time by nonnegative real numbers:

$$\mathit{Time} \mathrel{\widehat{=}} \{ r : \mathit{real} \mid 0 \le r < \infty \}.$$

The real-time refinement calculus makes use of a special real-valued variable, $\tau$,
for the current time. To allow for nonterminating programs, we allow $\tau$ to take
on the value infinity ($\infty$):

$$\mathit{Time}_\infty \mathrel{\widehat{=}} \mathit{Time} \cup \{\infty\}.$$

We refer to the set of variables in scope as the environment, and use the name $\rho$
for the environment. In real-time programs we distinguish five kinds of variables:
inputs, $\rho.in$, which are under external control; outputs, $\rho.out$, which are under
the control of the program; local variables, $\rho.local$, which are under the control
of the program, but unlike outputs are not externally visible; auxiliary variables,
$\rho.aux$, which are similar to local variables, but are restricted to appear only in
assumptions, specifications, deadline commands and assignments to auxiliary
variables; and the current time variable, $\tau$. Inputs and outputs are modelled
as functions from Time to the declared type of the variable, e.g., given the
declaration, ‘input beam : boolean’, beam is modelled as a function from Time
to boolean, with beam(t) giving the value of beam at time t. Note that it is not
meaningful to talk about the value of a variable at time infinity. Only the current
time variable, $\tau$, may take on the value infinity.







We use the term state to refer to the non-external variables (i.e., the
non-trace variables), $\rho.state = \rho.local \cup \rho.aux \cup \{\tau\}$. State variables are modelled by
values of their declared type ($\mathit{Time}_\infty$ for $\tau$).

In earlier work [13, 7] all variables (including locals) were modelled as func- 
tions of time (timed traces). For auxiliary variables this is not possible, because 
assignments to auxiliary variables take no time and a timed trace only allows a 
variable to have a single value at any one time. Hence within the semantics of a 
command, we represent an auxiliary variable, x, by its value before the execution 
of the command, $x_0$, and its value after the execution of the command, x. Having
introduced this model for auxiliary variables, we decided to use the same model 
for local variables. Either model could be used for local variables, but choosing 
a similar model for auxiliary and local variables makes the semantics a little 
simpler. 

In this paper we represent the semantics of a command by a predicate in a
form similar to that of Hehner [8, 9]. The predicate relates the initial and final
values of the state variables as well as constraining the traces of the outputs
over time. All our commands insist that time does not go backwards: $\tau_0 \le \tau$.
The meaning function, $\mathcal{M}$, takes the variables in scope, $\rho$, and a command C
and returns the corresponding predicate, $\mathcal{M}_\rho(C)$. As for Hehner, refinement of
commands (in an environment, $\rho$) is defined as reverse entailment:

$$C \sqsubseteq_\rho D \mathrel{\widehat{=}} \mathcal{M}_\rho(C) \Lleftarrow \mathcal{M}_\rho(D)$$

where ‘$P \Lleftarrow Q$’ holds if for all possible values of the variables, whenever Q holds,
P holds.

Real-time specification command. We define a possibly nonterminating real-time
specification command similar to that of Morgan [12],

$$\infty x\colon [P,\ Q],$$

where x is a vector of variables called the frame, the predicate P is the
assumption made by the specification, and the predicate Q is its effect. The ‘$\infty$’ at the
beginning is just part of the syntax; it reminds us that the command might not
terminate. The assumption P is assumed to hold at the start time of the
command. The frame, x, of a specification command lists those variables that may be
modified by the command. The frame may not include inputs. The current time
variable, $\tau$, is implicitly in the frame. All outputs not in the frame, i.e., those
in $\rho.out$ but not x, are defined to be stable for the duration of the command,
provided the assumption holds initially. We define the predicate stable by

$$\mathit{stable}(v, S) \mathrel{\widehat{=}} S \ne \{\} \Rightarrow (\exists x \bullet v(\!|S|\!) = \{x\})$$

where $v(\!|S|\!)$ is the image of the set S through the function v. We allow the first
argument of stable to be a vector of variables, in which case all variables in the
vector are stable. To specify the closed interval of times from s until t, we use
the notation $[s \ldots t]$. The open interval is specified by $(s \ldots t)$. We also allow
half-open, half-closed intervals. The operator ‘$\setminus$’ is set difference.
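As an informal illustration of the stable predicate, the following Python sketch checks stability over a finite set of sample times. Sampling is an assumption made only for this example (the definition above ranges over dense sets of times), and the names `stable` and `beam` are hypothetical.

```python
from typing import Callable, Hashable, Iterable


def stable(v: Callable[[float], Hashable], times: Iterable[float]) -> bool:
    """Sampled analogue of stable(v, S): over a non-empty S, v takes one value."""
    image = {v(t) for t in times}  # the image of S through v, on the samples only
    return len(image) <= 1         # an empty S is trivially stable


def beam(t: float) -> bool:
    # Hypothetical trace: the beam becomes true at time 5.0 and stays true.
    return t >= 5.0


stable_before = stable(beam, [0.0, 1.0, 2.0])  # beam is false throughout
stable_across = stable(beam, [4.0, 5.0, 6.0])  # beam changes at 5.0
stable_empty = stable(beam, [])                # empty set of times
```

Under these assumptions the first and third checks succeed and the second fails, because the sampled trace takes two values across the change at time 5.0.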







Definition 1 (real-time specification). Given variables, $\rho$, a frame, x,
contained in $\rho.local \cup \rho.aux \cup \rho.out$, a predicate P involving the variables in $\rho$
(including $\tau$), and a predicate Q involving variables in $\rho$, and initial variables
(zero-subscripted variables) corresponding to those in $\rho.state$, the meaning of a possibly
nonterminating real-time specification command is defined by the following.

$$\mathcal{M}_\rho(\infty x\colon [P,\ Q]) \mathrel{\widehat{=}} \tau_0 \le \tau \land ((\tau_0 < \infty \land P_0) \Rightarrow (Q \land \mathit{stable}(\rho.out \setminus x, [\tau_0 \ldots \tau])))$$

$P_0$ stands for the predicate P with all occurrences of $\tau$, and local and auxiliary
variables that are in the frame, replaced by their zero-subscripted forms.

Note that if P does not hold initially the command still guarantees that time
does not go backwards. Because $\tau$ may take on the value infinity, the above
specification command allows nontermination. As abbreviations, if P is omitted,
then it is taken to be true, and if the frame is empty the ‘:’ is omitted.

Primitive real-time commands can be defined in terms of equivalent specification
commands. In Fig. 1 we define: a terminating specification command, x: [P, Q];
the null command, skip, that does nothing and takes no time; a command, idle,
that does nothing but may take time; an absolute delay command; a multiple
assignment; the deadline command; a command, read, to sample a value from
an external input; a command, write, to output a value to an external output, o
(we allow references to o within the expression B; these refer to the initial value
of o); a command, gettime, to obtain the current time; and an assumption. The
expressions used in the commands are assumed to be idle-stable, that is, their
value does not change over time provided all the variables under the control of
the program are stable. In practice this means that the expressions cannot refer
to $\tau$ or to the value of external inputs.

Definition 2 (idle-stable). Given variables $\rho$, an expression E is idle-stable
provided, $\tau_0 \le \tau \land \mathit{stable}(\rho.out, [\tau_0 \ldots \tau]) \Rrightarrow E[\tau \backslash \tau_0] = E$.

The deadline command is unusual. It takes no time and guarantees to com- 
plete by the given deadline. It is not possible to implement a deadline command 
by generating code. Instead we need to check that the code generated for a pro- 
gram that contains a deadline command will always reach the deadline command 
by its deadline [2]. We discuss this further in Sect. 4. 

Compound real-time commands. Because we allow nonterminating commands,
we need to be careful with our definition of sequential composition. If the first
command of the sequential composition does not terminate, then we want the
effect of the sequential composition on the values of the outputs over time to be
the same as the effect of the first command. This is achieved by ensuring that
for any command in our language, if it is executed at $\tau = \infty$, it has no effect.
(For the specification command this is achieved by the assumption $\tau_0 < \infty$ in
its definition.) Here we provide a definition of sequential composition in terms
of the effects of the two commands.







Definition 3 (primitive real-time commands). Given a vector of variables, x,
not including any inputs; a predicate, P, with no references to initial state variables; a
predicate, Q, that may refer to initial state variables; an idle-stable, time-valued
expression D; a vector of idle-stable expressions, E, of the same length as x and assignment
compatible with x; a local variable, v; an input i that is assignment compatible with v;
an output o; an idle-stable expression, B, that is assignment compatible with o; and a
time-valued local variable, t; the real-time commands are defined as follows.

$$\begin{array}{rcl}
x\colon [P,\ Q] & \mathrel{\widehat{=}} & \infty x\colon [P,\ Q \land \tau < \infty] \\
\mathbf{skip} & \mathrel{\widehat{=}} & [\tau_0 = \tau] \\
\mathbf{idle} & \mathrel{\widehat{=}} & [\tau_0 \le \tau] \\
\mathbf{delay\ until}\ D & \mathrel{\widehat{=}} & [D \le \tau] \\
x := E & \mathrel{\widehat{=}} & x\colon [\mathit{def}(E),\ x = E[x \backslash x_0]], \text{ where } x \text{ does not include outputs} \\
\mathbf{deadline}\ D & \mathrel{\widehat{=}} & [\tau_0 = \tau \le D] \\
v\colon \mathbf{read}(i) & \mathrel{\widehat{=}} & v\colon [v \in i(\!|[\tau_0 \ldots \tau]|\!)] \\
o\colon \mathbf{write}(B) & \mathrel{\widehat{=}} & o\colon [\mathit{def}(B_0),\ o(\tau) = B_0] \\
t\colon \mathbf{gettime} & \mathrel{\widehat{=}} & t\colon [\tau_0 \le t \le \tau] \\
\{P\} & \mathrel{\widehat{=}} & [P,\ \tau_0 = \tau]
\end{array}$$

Predicate def(E) characterises those states in which the expressions E are well defined,
i.e., there are no divisions by zero, etc. For the deadline command, D need not be
idle-stable because a deadline takes no time. For the write command, $B_0$ stands for
$B[o \backslash o(\tau_0)]$.



Fig. 1. Definition of primitive real-time commands 
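To make the timing content of these definitions concrete, the following Python sketch models a few of the primitive commands as predicates over a start time and a finish time, and checks refinement by brute force over a small grid of times. It is only an illustration under simplifying assumptions: outputs, frames and assumptions are ignored, the grid `GRID` is arbitrary, and the helper names (`spec`, `refines`) are hypothetical.

```python
from itertools import product

INF = float("inf")
GRID = [0.0, 1.0, 2.0, 3.0, INF]  # sample values for the start and finish times


def spec(effect):
    """Meaning of a terminating specification [true, effect] with an empty frame."""
    def meaning(t0, t):
        # Time never goes backwards. Started at a finite time, the command
        # terminates and establishes its effect; started at infinity, it has none.
        return t0 <= t and (t0 == INF or (effect(t0, t) and t < INF))
    return meaning


skip = spec(lambda t0, t: t0 == t)
idle = spec(lambda t0, t: t0 <= t)


def delay_until(d):
    return spec(lambda t0, t: d <= t)


def deadline(d):
    return spec(lambda t0, t: t0 == t and t <= d)


def refines(c, d):
    """c is refined by d: every behaviour allowed by d is also allowed by c."""
    return all(c(t0, t) for t0, t in product(GRID, GRID) if d(t0, t))
```

On this grid, `refines(idle, skip)` and `refines(skip, deadline(2.0))` hold, while `refines(skip, idle)` and `refines(deadline(2.0), skip)` do not: a deadline only restricts when the command may be reached, which is why it cannot be implemented by code alone.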



Definition 4 (sequential composition). Given variables, $\rho$, and real-time
commands, C and D, their sequential composition is defined by the following.

$$\mathcal{M}_\rho(C; D) \mathrel{\widehat{=}} \exists \rho.state' \bullet \mathcal{M}_\rho(C)[\rho.state \backslash \rho.state'] \land \mathcal{M}_\rho(D)[\rho.state_0 \backslash \rho.state']$$

Recall that $\rho.state = \rho.local \cup \rho.aux \cup \{\tau\}$. Note that even if the precondition of
the second command does not hold, the sequential composition still guarantees
the effect of the first command and that the finish time is greater than or equal
to the finish time of the first command.
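The role of the intermediate state can be illustrated with a small Python sketch in which the state is reduced to the current time alone, so that a command is a predicate over a start time and a finish time. This is a simplification assumed for the example (no local, auxiliary or output variables), and the names `seq`, `same` and `forever` are hypothetical.

```python
INF = float("inf")
GRID = [0.0, 1.0, 2.0, INF]  # candidate start, intermediate and finish times


def idle(t0, t):
    # Terminating command that may take time; no effect if started at infinity.
    return t0 <= t and (t0 == INF or t < INF)


def forever(t0, t):
    # A command that never terminates: its finish time is infinity.
    return t0 <= t and (t0 == INF or t == INF)


def seq(c, d):
    """Sequential composition: some intermediate time links c's end to d's start."""
    def meaning(t0, t):
        return any(c(t0, mid) and d(mid, t) for mid in GRID)
    return meaning


def same(c, d):
    """True if two commands allow exactly the same behaviours on the grid."""
    return all(c(a, b) == d(a, b) for a in GRID for b in GRID)
```

On this grid `same(seq(forever, idle), forever)` holds: a command placed after a nonterminating one starts at time infinity and therefore adds nothing, which is the property the definition is designed to ensure.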

A variable block introduces a new local or auxiliary variable. The allocation 
and deallocation of a local variable may take time. This is allowed for in the 
definition by the use of idle commands. 

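Definition 5 (block). Given an environment, $\rho$, and a command, C,

$$\mathcal{M}_\rho(|[\ \mathbf{var}\ v;\ C\ ]|) \mathrel{\widehat{=}} (\exists v_0, v \bullet \mathcal{M}_{\rho'}(\mathbf{idle};\ C;\ \mathbf{idle}))$$
where $\rho'$ is $\rho$ updated with the local variable v, and
$$\mathcal{M}_\rho(|[\ \mathbf{aux}\ x;\ C\ ]|) \mathrel{\widehat{=}} (\exists x_0, x \bullet \mathcal{M}_{\rho''}(C))$$
where $\rho''$ is $\rho$ updated with the auxiliary variable x.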









We abbreviate multiple declarations with distinct names by merging them into
a single block, e.g., |[ var v; aux x; C ]| = |[ var v; |[ aux x; C ]| ]|.

Due to space limitations we do not attempt to give a complete definition of
loops; more complete details can be found elsewhere [3]. Each iteration of a loop
takes a minimum amount of time, d, which is strictly positive to avoid Zeno-like
behaviour. A single branch loop, $DO = \mathbf{do}\ B \rightarrow C\ \mathbf{od}$, can be characterised as
follows: there exists a strictly positive time, d, such that

$$DO = (|[\ \mathbf{con}\ s;\ \{\tau = s\};\ [B];\ C;\ \mathbf{delay\ until}\ s + d\ ]|;\ DO)\ \sqcap\ [\lnot B]$$

There is a (deterministic) choice between two alternatives. The second alterna- 
tive corresponds to the guard evaluating to false and termination of the loop. 
The first alternative corresponds to the guard evaluating to true. The logical 
constant s captures the start time of a single iteration. The guard evaluation 
(which typically takes time unless, for example, the guard is the constant true) 
is followed by the execution of the command, C. A delay is included at the end 
of an iteration to ensure the time is at least d time units later than the start 
time of the iteration, s. This ensures that even if the guard is the constant true 
and the body is the null command skip, each iteration takes at least d time 
units and hence Zeno-like behaviour is avoided. 
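The effect of the closing delay can be seen in a small Python sketch that advances a clock through a number of iterations. It is a simulation under assumptions made for the example only: guard evaluation and body are lumped into a single `body_time`, and the function name `run_iterations` is hypothetical.

```python
def run_iterations(n: int, d: float, body_time: float = 0.0, start: float = 0.0) -> float:
    """Return the time reached after n iterations with minimum iteration time d."""
    tau = start
    for _ in range(n):
        s = tau                # logical constant s captures the iteration start
        tau += body_time       # guard evaluation followed by the body
        tau = max(tau, s + d)  # delay until s + d
    return tau


zero_time_body = run_iterations(10, 0.5)             # body and guard take no time
slow_body = run_iterations(3, 0.5, body_time=2.0)    # body dominates the delay
```

Even when the body takes no time at all, ten iterations advance the clock by ten times d, so infinitely many iterations cannot fit into a finite interval.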

3 An example 

Specification. To illustrate our approach we use the example of a conveyor belt
that transports objects which are measured for their size and then sorted into 
a corresponding bin. A light beam is used to detect objects, and measure their 
size. The boolean input beam represents the detection of the light beam: its 
value is false (no light) at time t if and only if there is an object on the conveyor 
blocking the beam at time t. (We ignore failures of the light beam, etc.) The 
boolean output lbin selects between a bin for large objects (if it is true) and
a bin for small objects (if it is false). The objects on the conveyor belt have a 
minimum length and separation. This translates to there being a minimum time, 
MinW , for which beam is false while an object passes the beam, and a minimum 
time, MinS, for which beam is true between objects. 

To represent these properties we introduce two logical constants (specifica- 
tion variables) gs and ge, which represent the increasing sequences of times at 
which the beam goes true (no object) and false (object), respectively. The name 
gs (ge) abbreviates ‘gap start’ (‘end’), where the gap in question is the gap be- 
tween objects during which beam is true. We include the initial gap before the 
first object and after the last object (if there are a finite number of objects). 
Both gs and ge are completely derived from beam. They are used purely to 
simplify expression of the properties of beam. Logical constants can be used for 
specification purposes but cannot appear in the compiled code of a program. 
The sequences begin with index one, have the same domain, and may be infinite 
(indicating an infinite number of objects passing on the conveyor over all time). 
We assume that there is no object on the conveyor for an initial period of at 







least MinS, i.e., gs(1) is zero and ge(1) is greater than or equal to MinS. The
value of ge(1) represents the time at which the first object breaks the beam. If
there are only a finite number of objects that pass on the conveyor then the value
of the last element of ge will be infinity, e.g., if no objects at all pass the light
beam, then ge(1) would be infinity. The following gives a specification of these
variables, and the assumptions we make about them. The notation $\mathrm{seq}_\infty\ \mathit{Time}$
stands for the type of possibly infinite sequences of times, with indices starting
at one. As well as giving the types of the variables and constants, we also give
their units of measurement [5].

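input beam : boolean; output lbin : boolean;

con gs, ge : $\mathrm{seq}_\infty\ \mathit{Time}$;

const MinS = 40 ms; MinW = 20 ms; MaxW = 40 ms;

$$\left\{ \begin{array}{l}
\mathrm{dom}\ ge = \mathrm{dom}\ gs \land 1 \in \mathrm{dom}\ gs \land gs(1) = 0 \land {} \\
(\forall i : \mathrm{dom}\ gs \bullet \mathit{MinS} \le ge(i) - gs(i) \land {} \\
\quad (i \ne 1 \Rightarrow \mathit{MinW} \le gs(i) - ge(i-1) \le \mathit{MaxW})) \land {} \\
(\forall t : \mathit{Time} \bullet beam(t) \Leftrightarrow (\exists i : \mathrm{dom}\ gs \bullet t \in [gs(i) \ldots ge(i))))
\end{array} \right\}$$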
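For a finite prefix of the sequences, the assumptions can be checked mechanically. The Python sketch below does this for lists of times in milliseconds; the use of zero-based lists for the one-based sequences gs and ge, the half-open reading of the gap interval, and the function names `gaps_ok` and `beam` are assumptions of the example.

```python
INF = float("inf")
MIN_S, MIN_W, MAX_W = 40, 20, 40  # ms, as in the constant declarations


def gaps_ok(gs, ge):
    """Check the assumptions on gap starts gs and gap ends ge (finite lists)."""
    if len(gs) != len(ge) or not gs or gs[0] != 0:
        return False
    for i in range(len(gs)):
        if ge[i] - gs[i] < MIN_S:                           # gap too short
            return False
        if i > 0 and not MIN_W <= gs[i] - ge[i - 1] <= MAX_W:  # object width out of range
            return False
    return True


def beam(gs, ge, t):
    """beam(t) holds exactly when t lies within some gap between objects."""
    return any(s <= t < e for s, e in zip(gs, ge))


# Two objects: one blocks the beam over 100..130 ms, the next over 210..250 ms.
GS = [0, 130, 250]
GE = [100, 210, INF]
```

With these values `gaps_ok(GS, GE)` holds, `beam(GS, GE, 50)` is true and `beam(GS, GE, 110)` is false; shortening the first object to 10 ms, as in `gaps_ok([0, 110], [100, INF])`, violates the MinW assumption.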

The task of the program is to measure the size of the passing objects, and 
select the bin into which they are to be placed. We assume that the conveyor 
moves with velocity vel metres per second. The size of an object can only be 
measured approximately. Hence the specification allows a margin of error, mrgn, 
in determining whether an object is large or small. If an object is of size greater 
than or equal to limit + mrgn then it must go in the large bin. If its size is less 
than or equal to limit — mrgn it must go in the small bin. Objects with sizes 
between limit — mrgn and limit + mrgn can go in either bin. We assume that the 
value of mrgn has been adjusted (decreased) to take into account fluctuations 
in the velocity of the conveyor. The predicate ObjSize relates the jth object to 
the bins it is allowed to be placed in. 

const vel = 1 m/s; limit = 30 mm; mrgn = 1 mm; bin_limit = 10 ms;

$$\mathit{ObjSize}(j, b) \mathrel{\widehat{=}} \mathbf{let}\ sz = vel * (gs(j+1) - ge(j)) \bullet (sz \ge limit + mrgn \Rightarrow b) \land (sz \le limit - mrgn \Rightarrow \lnot b)$$
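A worked instance may help. The Python sketch below evaluates ObjSize for example gap sequences, with times in milliseconds and sizes in millimetres (so that 1 m/s becomes 1 mm per ms). The zero-based list indexing, the example values and the name `obj_size` are assumptions of the sketch.

```python
VEL = 1.0                  # mm per ms, i.e. 1 m/s
LIMIT, MRGN = 30.0, 1.0    # mm


def obj_size(gs, ge, j, b):
    """ObjSize(j, b) for the j-th object (j counts from one)."""
    sz = VEL * (gs[j] - ge[j - 1])  # gs(j + 1) - ge(j) on zero-based lists
    large_ok = (not sz >= LIMIT + MRGN) or b       # clearly large implies b
    small_ok = (not sz <= LIMIT - MRGN) or not b   # clearly small implies not b
    return large_ok and small_ok


# Object 1 blocks the beam for 30 ms (30 mm); object 2 for 40 ms (40 mm).
GS = [0.0, 130.0, 250.0]
GE = [100.0, 210.0, float("inf")]
```

Object 1 lies within the margin of error, so both `obj_size(GS, GE, 1, True)` and `obj_size(GS, GE, 1, False)` hold; object 2 is clearly large, so only `obj_size(GS, GE, 2, True)` holds.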

The output lbin controls the bin selector. In order for the object to be placed
in the correct bin, lbin should have the correct value from time bin_limit after
the end of the jth object (gs(j + 1)) through until the next object is detected
(ge(j + 1)). We introduce the predicate ObjBin to abbreviate this condition.

$$\mathit{ObjBin}(j) \mathrel{\widehat{=}} (\exists b : \mathit{boolean} \bullet \mathit{ObjSize}(j, b) \land lbin(\!|[gs(j+1) + \mathit{bin\_limit} \ldots ge(j+1)]|\!) = \{b\})$$

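The program is specified using a nonterminating specification command with a
termination time, $\tau$, of infinity.

$$\infty\, lbin\colon [\tau = \infty \land (\forall j : \mathbb{N} \bullet 1 \le j \land j + 1 \in \mathrm{dom}\ gs \Rightarrow \mathit{ObjBin}(j))] \qquad (1)$$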







The main program code. Before going through the details of the refinement of
the above specification, we give the final machine-independent program in Fig. 2.
It makes use of a procedure Await (specified in Sect. 3.1) that waits for the beam
to attain the value of its first parameter and returns an approximation to the
time at which this occurs. The program makes use of the auxiliary variable i
which counts the objects as they pass. The local variables st and et capture the
start and finish times of the ith object (approximately), and the variable size is
used to calculate the (approximate) size of the object from the time it took to
pass and its velocity. If the calculated size is greater than or equal to limit then
lbin is set to true, otherwise it is set to false. It is assumed that the program
starts when the current time, $\tau$, is at least MinS before the first object
passes through the beam.




|[ aux i;
   i := 1;
   do true →
      |[ var st, et, size;
         deadline ge(i);
         B : st ← Await(false, | gs(i), ge(i));       -- start of object at ge(i)
         C : et ← Await(true, | ge(i), gs(i + 1));    -- end of object at gs(i + 1)
         size := (et - st) * vel;
         lbin : write(limit ≤ size);
         D : deadline gs(i + 1) + bin_limit;
         i := i + 1
      ]|
   od
]|



Fig. 2. Main program 



In addition to the expected standard code there are deadline commands and 
a number of uses of the auxiliary variable, i, and logical constants; these are 
highlighted within boxes. These are used to ensure that the operation of the 
program takes place in a timely fashion. No code needs to be generated for any 
of the highlighted constructs. Their purpose is to facilitate reasoning and to 
allow the specification of timing constraints via deadline commands. 












3.1 Specification of procedure Await 



The task of procedure Await is to wait until beam takes on the value of its first 
argument, val, and return in result pt (an approximation to) the time at which 
the value of beam changes to val. To allow simpler specification of the procedure, 
two auxiliary parameters are used: prev gives the (past) time at which beam 
previously changed, and event gives the (future) time of the awaited change. 
When Await is called, the time is after prev. The procedure may assume that 
beam is not equal to val from prev until event, and that once it changes to val 
it will remain equal to val for a time of at least err. If the value of beam never 
changes, then Await never returns. Otherwise it returns the result, pt, which is 
an approximation to event. 



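const err = 1 ms; $\{err \le \mathit{MinS} \land err \le \mathit{MinW} \land err * vel \le mrgn\}$;

procedure pt : Time ← Await(val : boolean; aux prev, event : Time) $\mathrel{\widehat{=}}$

$$\infty\, pt\colon \left[ \begin{array}{l}
prev \le \tau \land {} \\
beam(\!|[prev \ldots event)|\!) = \{\lnot val\} \land {} \\
beam(\!|[event \ldots event + err]|\!) = \{val\}
\end{array} ,\quad \begin{array}{l}
event = \tau = \infty\ \lor {} \\
(event < \infty \land \tau < \infty \land {} \\
\ event \le pt \le event + err)
\end{array} \right]$$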



The implementation of Await in Fig. 3 loops while testing the value of beam until
it changes to equal val. The time is initially (and hence always) after prev. Hence
when a value equal to val is read from beam, the time must be after event. The
read must be completed before event + err in order to ensure that the procedure
is not detecting some later change of beam to val. Hence the deadline after the
read. If the value read is equal to val the loop terminates and one can deduce
that event is less than or equal to the current time, $\tau$. The deadline after the
gettime ensures that the value of pt is a close enough approximation to event. If
the loop never terminates then for any time, t, there is a later time, t', at which
both the condition for repeating ($p \ne val$) and the loop invariant (the assertion
just before the until) hold, and hence $t' \le event + err$ holds for arbitrarily
large values of t'. Therefore event must be infinity. Note that if the repeat loop
never terminates then the deadline after the loop is never reached and does not
have to be considered.
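The link between the polling rate and the deadlines can be explored with a small Python simulation. It is a sketch under assumptions made for the example only: times are integers in microseconds, each read takes a fixed `step` and samples the beam at its completion time, gettime is instantaneous, and a `horizon` bounds the simulation (the specified procedure itself never returns if the beam never changes). The name `await_change` is hypothetical.

```python
from typing import Callable, Optional

EVENT = 100_000  # microseconds: time at which the beam changes to True
ERR = 1_000      # microseconds: allowed error in the reported time


def await_change(beam: Callable[[int], bool], val: bool,
                 start: int, step: int, horizon: int) -> Optional[int]:
    """Poll beam every `step` microseconds; return the time at which val is seen."""
    tau = start
    while tau + step <= horizon:
        tau += step            # one read of beam takes `step` microseconds
        if beam(tau) == val:
            return tau         # pt : gettime, assumed to take no time here
    return None                # no change observed before the horizon


def beam(t: int) -> bool:
    return t >= EVENT


pt_fast = await_change(beam, True, start=0, step=300, horizon=200_000)
pt_slow = await_change(beam, True, start=900, step=1_500, horizon=200_000)
```

With a 300 microsecond polling period the result lies between EVENT and EVENT + ERR, so both deadlines would be met; with a 1500 microsecond period started at an unlucky offset the change is observed only at 101400, after EVENT + ERR, which is the kind of path that the timing analysis must rule out.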



3.2 Refinement of the main program 

To refine the specification of the main program we make use of a refinement rule 
for introducing a possibly non-terminating loop developed in earlier work [3]. 
The rule allows for the case in which the loop may terminate, although in our 
example the main program loop never terminates. Reasoning about a loop makes 
use of a loop invariant, but in the case of non-termination, the reasoning is quite 
different to what we are used to with a terminating loop. The loop invariant is 
required to be idle-invariant, that is, invariant over the passage of time if the 
program state and outputs are stable. This is so that if the invariant holds before 
evaluation of the guard, it holds after evaluation of the guard. 







|[ var p : boolean; aux before : Time;
   {$prev \le \tau$};
   repeat
      E : before := $\tau$;
          p : read(beam);
      F : deadline event + err;
          {$(p = val \Rightarrow event \le \tau) \land (p \ne val \Rightarrow before \le event) \land \tau \le event + err$}
   until p = val;
   {$event \le \tau$};
   pt : gettime;
   deadline event + err
]|



Fig. 3. Body of procedure Await 



Definition 6 (idle-invariant). A predicate P is idle-invariant provided,

$$\tau_0 \le \tau \land \mathit{stable}(\rho.out, [\tau_0 \ldots \tau]) \land P[\tau \backslash \tau_0] \Rrightarrow P.$$

Note that predicates of the form $\tau \le D$ (where D is idle-stable) are not
idle-invariant, but predicates of the form $D \le \tau$ are.
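The note can be checked on a small example. The Python sketch below tests idle-invariance of predicates that depend only on the current time, over a finite grid of times; ignoring the outputs and the rest of the state is an assumption of the sketch, as are the grid, the constant `D` and the name `idle_invariant`.

```python
GRID = [0.0, 1.0, 2.0, 3.0]  # sample times
D = 2.0                      # an idle-stable, time-valued constant


def idle_invariant(pred) -> bool:
    """If pred holds at time t0, it must hold at every later time t."""
    return all(pred(t) for t0 in GRID for t in GRID if t0 <= t and pred(t0))


before_d = idle_invariant(lambda t: t <= D)  # current time is at most D
after_d = idle_invariant(lambda t: D <= t)   # current time is at least D
```

Here `before_d` is false, because the predicate holds at time 0 but not at time 3, while `after_d` is true, because once D has passed it stays passed.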

If the loop body terminates, we require that it re-establishes the invariant, I,
and hence if the loop terminates one can deduce $\tau < \infty \land \lnot B \land I$. If the loop
body does not terminate then it must establish some other condition, R. In this
case the whole loop establishes $\tau = \infty \land R$. If the loop does not terminate but
the loop body terminates on every iteration, then we can deduce the following
predicate:

$$I_\infty \mathrel{\widehat{=}} (\forall t : \mathit{Time} \bullet (\exists \rho.local, \rho.aux, \tau \bullet t \le \tau \le D \land B \land I)).$$

A deadline, D, is included at the start of the loop body. This allows the extra
condition that the current time is before D to be included in $I_\infty$, thus linking
the invariant to the current time.

Law 7 (loop with nonterminating body). Given an idle-stable,
boolean-valued expression, B; an idle-invariant predicate, I, not involving $\tau_0$ or initial
(zero-subscripted) variables; a time-valued expression, D; and a predicate R not
involving $\tau_0$ or any local or auxiliary variables (including initial variables); then

$$\begin{array}{l}
\infty x\colon [I,\ (\tau < \infty \land \lnot B \land I) \lor (\tau = \infty \land (I_\infty \lor R))] \\
\sqsubseteq\ \mathbf{do}\ B \rightarrow \mathbf{deadline}\ D; \\
\qquad\quad \infty x\colon [B \land I \land \tau \le D,\ (\tau < \infty \land I) \lor (\tau = \infty \land R)] \\
\quad\ \mathbf{od}
\end{array}$$












The predicate R may not depend on final state variables because the values
of these are not defined at time infinity, and it may not refer to $\tau_0$ or initial
variables because R is used both in the specification, in which $\tau_0$ is the start
time of the whole loop, and in the body of the loop, in which $\tau_0$ is the start time
of an iteration. In order to refer to the start time of the whole loop within R it
is necessary to introduce an explicit logical constant to stand for the start time.

To apply this law to our example we introduce an auxiliary variable, i, which
contains the number of the next object to be recognised. The next object breaks
the beam at time ge(i). Hence this is a suitable deadline for the start of the
loop body. The invariant states that all previous objects have been placed in the
correct bin for their size. The most recent object, number i - 1 (if there was
one), is special because the bin selector must be held at its current value until
the start of the next object (within the next iteration).

$$\begin{array}{l}
I \mathrel{\widehat{=}} 1 \le i \land gs(i) \le \tau \land i \in \mathrm{dom}\ gs \land (\forall j : \mathbb{N} \bullet 1 \le j < i - 1 \Rightarrow \mathit{ObjBin}(j)) \\
\quad \land\ (i \ne 1 \Rightarrow (\exists b : \mathit{boolean} \bullet \mathit{ObjSize}(i - 1, b) \land {} \\
\qquad (lbin(\!|[gs(i) + \mathit{bin\_limit} \ldots \tau]|\!) = \{b\} \land lbin(\tau) = b)))
\end{array}$$

The final $lbin(\tau) = b$ allows for the case in which $\tau$ has not yet reached gs(i) +
bin_limit. The invariant is established by setting i to 1.

If there are no more objects on the conveyor (ever) then the body of the loop 
will not terminate. In this case the body of the loop must achieve the goal of 
the main program. The predicate R strengthens the goal with the condition that 
there are only a finite number of objects on the conveyor. 

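$$\begin{array}{l}
R \mathrel{\widehat{=}} (\exists i : \mathrm{dom}\ gs \bullet i + 1 \notin \mathrm{dom}\ gs) \land {} \\
\quad (\forall j : \mathbb{N} \bullet 1 \le j \land j + 1 \in \mathrm{dom}\ gs \Rightarrow \mathit{ObjBin}(j))
\end{array}$$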

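For the loop in the example, the guard is the constant, true, and hence the
postcondition in the specification in the law reduces to $\tau = \infty \land (I_\infty \lor R)$. This
predicate must imply the goal of the main program specification (1). If R holds
then this follows because R is a strengthening of the goal. If R does not hold,
then it suffices to show

$$\begin{array}{l}
(\forall i : \mathrm{dom}\ gs \bullet i + 1 \in \mathrm{dom}\ gs) \land I_\infty \\
\Rrightarrow (\forall j : \mathbb{N} \bullet 1 \le j \land j + 1 \in \mathrm{dom}\ gs \Rightarrow \mathit{ObjBin}(j))
\end{array}$$

This condition expands and simplifies to

$$\begin{array}{l}
(\forall i : \mathrm{dom}\ gs \bullet i + 1 \in \mathrm{dom}\ gs) \land {} \\
(\forall t : \mathit{Time} \bullet \\
\quad (\exists \tau : \mathit{Time};\ i : \mathbb{N} \bullet t \le \tau \le ge(i) \land 1 \le i \land gs(i) \le \tau \land i \in \mathrm{dom}\ gs \land {} \\
\qquad (\forall j : \mathbb{N} \bullet 1 \le j < i - 1 \Rightarrow \mathit{ObjBin}(j)) \land (i \ne 1 \Rightarrow \ldots))) \\
\Rrightarrow (\forall j : \mathbb{N} \bullet 1 \le j \Rightarrow \mathit{ObjBin}(j))
\end{array}$$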

It suffices to show that, for any positive j, ObjBin(j) holds. To do this we
choose t such that $gs(j + 2) \le t$. From the above condition there exist $\tau$ and i
such that







and hence ObjBin(j) follows from the remainder of the above condition. 

After applying the law for introducing a nonterminating loop we are left with 
the following loop body. 

    deadline ge(i); 

    $\infty\, i, lbin : [\, I \land \tau \le ge(i),\ (\tau < \infty \land I) \lor (\tau = \infty \land R) \,]$    (2) 

We introduce three local variables, st, et and size to store the start and finish 
times of the next object on the conveyor and the size of the object, respectively, 
and refine the body of the loop via the introduction of sequential compositions, 
procedure calls, assignments, a write and a deadline command. The details of 
these steps are similar to standard refinement steps and are omitted here. 

4 Timing constraint analysis 

Internal to the procedure. In order for compiled machine code to implement the 
machine-independent program it must guarantee to meet all the deadlines. The 
auxiliary variables and parameters introduced above aid this analysis. There are 
two deadlines within the procedure Await (Fig. 3). The deadline (F) within the 
repeat loop is reached initially from the entry to the procedure, and 
subsequently on each iteration. We defer analysis of the entry path to the 
analysis of the main program, because the context of the main program is 
necessary for the analysis. For an iteration we consider the path (shown in 
Fig. 4) that starts at the assignment to before (E), reads the value of beam 
into p, passes through the deadline (F), loops back to the start of the repeat 
because p is not equal to val, performs the assignment to before (E), reads the 
value of beam, and reaches the deadline (F). The guard evaluation is represented 
by $[p \ne val]$, which indicates that in order for the path to be followed, p 
must not be equal to val at that point in the path. (In refinement calculus 
terms $[p \ne val]$ is a coercion [12].) The initial time assigned to before, 
i.e., the time at which the path begins execution, must be before time event 
because the value of p was not equal to val, and the final deadline on the path 
is event + err. Hence, if the path is guaranteed to execute in less than time 
err, it will always meet its deadline. If this path is guaranteed to reach its 
deadline then any path with this as a suffix is also guaranteed to meet the 
final deadline. 

    E : before := $\tau$; 
        p : read(beam); 
    F : deadline event + err; 
        $\{(p = val \Rightarrow event \le \tau) \land (p \ne val \Rightarrow before \le event) \land \tau \le event + err\}$ 
        $[p \ne val]$; 
    E : before := $\tau$; 
        p : read(beam); 
    F : deadline event + err 

    Fig. 4. Repetition path in Await 

A similar analysis can be performed for the path exiting the repeat loop 
to the final deadline in Await. The path is the same as that in Fig. 4 except 
that it is extended past the deadline within the loop (F), exits the loop because 
p = val, and reads the current time into pt, before reaching the final deadline 
(G). The constraint on this path is also err. 

The main program. The analysis of the main program has to take into account 
deadlines within the procedure calls. There is a path that starts at (A) in Fig. 2. 
The path initialises i to 1, enters the loop, passes through the initial deadline, 
allocates the local variables st, et and size, makes the first call to Await (B), and 
within Await allocates and assigns the local and auxiliary variables 
corresponding to the formal value parameters, allocates the local variable p, 
extends the auxiliary variables with before, and follows the path into the repeat 
loop, ending at the first deadline (F) of event + err. The initial assertion 
guarantees that the start time of the path is less than or equal to ge(1) − MinS. 
For this call to Await, event is ge(1) and hence the final deadline is ge(1) + err. 
Therefore a suitable constraint on the path is 
ge(1) + err − (ge(1) − MinS) = MinS + err = 41 ms. If this path is guaranteed to 
execute in a time of less than 41 ms then the deadline is guaranteed to be 
reached. The remaining timing paths are analysed in a similar manner. 
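
The arithmetic behind such a path constraint can be sketched in a few lines. 
This is only an illustration: the function name path_constraint is 
hypothetical, and the individual values MinS = 40 ms and err = 1 ms are 
assumptions chosen so that their sum matches the 41 ms quoted above (the text 
here gives only the sum). Both offsets are measured relative to the same 
external event time, which is why that time cancels. 

```python
def path_constraint(latest_start_offset, deadline_offset):
    """Longest permitted execution time of a path.

    Both arguments are offsets (in ms) from the same external event time,
    so the event time itself never needs to be known.
    """
    return deadline_offset - latest_start_offset


MIN_S = 40  # ms, assumed value for illustration only
ERR = 1     # ms, assumed value for illustration only

# Entry path: starts no later than ge(1) - MinS, deadline is ge(1) + err.
entry_constraint = path_constraint(-MIN_S, ERR)

# Repetition path in Await: starts before the event, deadline is event + err.
repeat_constraint = path_constraint(0, ERR)
```

Under these assumed values the entry path may take up to MinS + err and the 
repetition path up to err, matching the two constraints derived in the text. 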

5 Conclusions 

This paper has examined the addition of auxiliary variables to a 
machine-independent real-time programming language for use in specifying and 
reasoning about real-time programs, including timing constraint analysis. 
Auxiliary counter variables, such as i in the main program, can be used to relate 
the program state to the $i$th occurrence of an event. Auxiliary time variables 
can be used to keep track of times of relevant events. For example, auxiliary 
time parameters, prev and event in the procedure Await, allowed the 
simplification of the specification of Await and allowed the expression of 
deadlines within the procedure relative to the time of occurrence of external 
events. In all cases the use of auxiliary variables/parameters does not generate 
any code in the final program. They are purely used to assist reasoning and 
timing constraint specification and analysis. 

Acknowledgements. This research was funded by Australian Research Council 
(ARC) Large Grant A49937045, Effective Real-Time Program Analysis. I would 
like to thank Colin Fidge, Karl Lermer and Luke Wildman for feedback on earlier 
drafts of this paper, Brendan Mahony and Mark Utting for fruitful discussions 
on the topic of this paper, Andrew Lenart for his work on a summer project 
looking at auxiliary variables, and the members of IFIP Working Group 2.3 on 
Programming Methodology for feedback on this topic, especially Rick Hehner 
for his advice on how to simplify our approach. 



References 

[1] C. J. Fidge, I. J. Hayes, and G. Watson. The deadline command. IEE 
Proceedings - Software, 146(2):104-111, April 1999. 

[2] S. Grundon, I. J. Hayes, and C. J. Fidge. Timing constraint analysis. In 
C. McDonald, editor, Computer Science '98: Proc. 21st Australasian Computer 
Science Conf. (ACSC'98), Perth, 4-6 Feb., pages 575-586. Springer-Verlag, 1998. 

[3] I. J. Hayes. Reasoning about non-terminating loops using deadline commands. In 
Roland Backhouse and Jose Oliveira, editors, Mathematics of Program 
Construction (MPC'2000), July 2000. 

[4] I. J. Hayes, C. J. Fidge, and K. Lermer. Semantic identification of dead 
control-flow paths. Technical Report 99-32, Software Verification Research 
Centre, The University of Queensland, October 1999. 

[5] I. J. Hayes and B. P. Mahony. Using units of measurement in formal specifications. 
Formal Aspects of Computing, 7(3):329-347, 1995. 

[6] I. J. Hayes and M. Utting. Coercing real-time refinement: A transmitter. In D. J. 
Duke and A. S. Evans, editors, BCS-FACS Northern Formal Methods Workshop 
(NFMW'96), Electronic Workshops in Computing. Springer-Verlag, 1997. 

[7] I. J. Hayes and M. Utting. A sequential real-time refinement calculus. Technical 
Report UQ-SVRC-97-33, Software Verification Research Centre, The University 
of Queensland, URL http://svrc.it.uq.edu.au, 1997. 

[8] E. C. R. Hehner. Termination is timing. In J.L.A. van de Snepscheut, editor, 
Mathematics of Program Construction, volume 375 of Lecture Notes in Computer 
Science, pages 36-47. Springer-Verlag, June 1989. 

[9] E. C. R. Hehner. A Practical Theory of Programming. Springer-Verlag, 1993. 

[10] J. Hooman and O. van Roosmalen. Formal design of real-time systems in a 
platform-independent way. Parallel and Distributed Computing Practices, 
1(2):15-30, 1998. 

[11] Sung-Soo Lim, Young Hyun Bae, Gyu Tae Jang, Byung-Do Rhee, Sang Lyul Min, 
Chang Yun Park, Heonshik Shin, Kunsoo Park, Soo-Mook Moon, and Chong Sang 
Kim. An accurate worst case timing analysis for RISC processors. IEEE Trans. 
on Software Eng., 21(7):593-604, July 1995. 

[12] C. C. Morgan. Programming from Specifications. Prentice Hall, second edition, 
1994. 

[13] M. Utting and C. J. Fidge. A real-time refinement calculus that changes only 
time. In He Jifeng, editor, Proc. 7th BCS/FACS Refinement Workshop, Electronic 
Workshops in Computing. Springer, July 1996. URL 
http://www.springer.co.uk/eWiC/Workshops/7RW.html. 




On Refinement and Temporal Annotations * 



Ron van der Meyden¹ and Yoram Moses² 

¹ School of Computer Science and Engineering 
The University of New South Wales, Sydney 2052, Australia 
meyden@cse.unsw.edu.au 
² Department of Electrical Engineering 
Technion, Haifa, Israel 
moses@ee.technion.ac.il 



Abstract. This paper introduces the semantics of a wide spectrum lan- 
guage with a rich compositional structure that is able to represent both 
temporal specifications and sequential programs. A key feature of the 
language is the ability to represent partial correctness annotations ex- 
pressed in temporal logic. A refinement relation is presented that enables 
refinement steps to make use of these partial correctness assertions. It 
is argued by means of an example that the approach presented allows 
for more flexible reasoning using temporal annotations than previous ap- 
proaches, and that the added flexibility has significant value for program 
optimization. 

Keywords: Refinement calculus, temporal logic, temporal refinement 
calculi 



1 Introduction 

Work on program refinement can be categorised into two classes. One of the 
most deeply explored approaches [Mor90,BvW98,Mor87] is state-based, premi- 
sed on the use of predicate transformers and weakest preconditions as a seman- 
tic basis. This is natural for programs whose specifications are descriptions of 
input/output relations, of which predicate transformers are a generalization. One 
of the advantages of this approach is the ability to write specifications containing 
constructs modelling two distinct types of annotation. Annotations of one type, 
which we call coercions, are used to state properties that the program is required 
to satisfy. Annotations of the other type, which we call assertions, are more like 
conventional program annotations in that they are used to state properties that 
the program has been proved to satisfy. 

Another class of refinement calculi, motivated directly by distributed 
programs, is action-based [GS86,Hol89,HL95,Win86,Lam94]. Work in this category 
typically begins with a process calculus. In a distributed setting, specifications 
and reasoning very frequently need to refer not just to the current state, but 
also to past and future events in the system. Thus, this approach enriches the 
process algebra with features of modal logic, typically Hennessy-Milner logic or 
the more expressive $\mu$-calculus. On the other hand, these works do not contain 
the assertions available in the state-based approaches. 

* Work supported by an Australian Research Council Large Grant, and by the 
Technion fund for advancement of research. Thanks to Kai Engelhardt for helpful 
discussions on the topic of this paper. 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 185-202, 2000. 

(c) Springer-Verlag Berlin Heidelberg 2000 

Our ultimate goal is to develop a framework for distributed programs that 
has both the rich compositional structure and temporal expressiveness of the 
process calculus based approaches and the ability to represent the two types of 
annotation available in the predicate transformer based approaches. Moreover, 
we aim for a high degree of flexibility in the use of assertions: given the temporal 
setting, we would like to be able to reason about, and refine, any program frag- 
ment based on properties established by any other program fragment, both past 
and future. Our contribution in this paper is to develop such a framework for 
sequential programs. There exist prior proposals [UF96] combining the expres- 
sive power of temporal formalisms and the annotational expressiveness of the 
predicate transformer based approaches to refinement, but we argue that our 
framework allows for some desirable modes of reasoning not available in these 
proposals. In particular, we show by means of an example (in Section 8) that 
it is possible in our framework to move annotations within the program in a 
manner that is not supported by other formalisms. 

2 Syntax 

The setting we are considering is one in which there is a single agent that 
performs actions, which in turn modify the state. We assume as given an 
environment, consisting of a tuple $(S, A, R, \pi)$, where $A$ is a set of basic 
actions, $S$ is a set of states, $R \subseteq S \times A \times S$ is a transition 
relation describing the effect of actions on the state, and 
$\pi : Prop \to \mathcal{P}(S)$ is an interpretation, mapping each propositional 
constant in some set Prop to a set of states. The environment will act as an 
implicit parameter in what follows. 

Starting from a logical language $\mathcal{L}$ of formulas, which we shall define 
shortly, the set A of basic program actions, a set PV of program variables and a 
set CV of constraint variables, we define the class PT of program templates as 
follows. 

$P ::= \epsilon \mid a \mid Z \mid [\varphi]^X \mid [\varphi, \psi]^X \mid \{\varphi\}_J \mid P ; P \mid P + P \mid P^\omega$ 

where $\epsilon$ is a distinguished symbol representing the empty program, 
$a \in A$ is a basic action, $Z \in PV$ is a program variable, $\varphi$ and $\psi$ 
are formulae of $\mathcal{L}$, $X \in CV$ is a constraint variable, and 
$J \subseteq CV$ is a set of constraint variables. A program template containing 
no program variables will be called simply a program. The program $[\varphi]^X$ 
is called a coercion, while $[\varphi, \psi]^X$ is called a specification. We call 
these constructs constraints. A program of the form $\{\varphi\}_J$ is called an 
assertion. In contrast to coercions, an assertion does not constrain the 
execution. It is a statement, akin to a comment in the program text, whose truth 
depends on the rest of the program. The subscript J is called the justification 
of the assertion, and it keeps track of what constraints in the current program 
this assertion depends on. Intuitively, a program is called valid if the 
assertions in the program are all guaranteed to be satisfied. We will present a 
detailed motivation for the constraint variables and justifications in Section 5. 

The operator ; represents sequential execution of programs. The operators 
+, representing nondeterministic choice, and $\omega$, representing finite or 
infinite repetition, are nondeterministic. Together with coercions, they can be 
used to define the standard deterministic branching and looping operators 
if-then-else and while. We will use the following definitions of these 
constructs. 

- $\mathbf{if}_X\ \varphi\ \mathbf{then}\ P\ \mathbf{else}\ Q$ will stand for $([\varphi]^X ; P) + ([\neg\varphi]^X ; Q)$, and 

- $\mathbf{while}_X\ \varphi\ \mathbf{do}\ P$ will stand for $([\varphi]^X ; P)^\omega ; [\neg\varphi]^X$. 
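
To make the grammar concrete, the following is a minimal sketch of program 
templates as an abstract syntax tree, together with the usual encoding of 
if-then-else and while by means of coercions, choice and repetition. All class 
and function names are hypothetical, formulas are left abstract (any hashable 
value, with ("not", f) standing for negation), and the encoding shown is the 
standard one, assumed here rather than taken verbatim from the text. 

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Empty:            # the empty program
    pass


@dataclass(frozen=True)
class Action:           # a basic action a in A
    name: str


@dataclass(frozen=True)
class ProgVar:          # a program variable Z in PV
    name: str


@dataclass(frozen=True)
class Coercion:         # coercion labelled by a constraint variable
    formula: object
    label: str


@dataclass(frozen=True)
class Spec:             # specification with pre- and postcondition
    pre: object
    post: object
    label: str


@dataclass(frozen=True)
class Assertion:        # assertion with its justification set J
    formula: object
    justification: FrozenSet[str]


@dataclass(frozen=True)
class Seq:              # sequential composition
    left: object
    right: object


@dataclass(frozen=True)
class Choice:           # nondeterministic choice
    left: object
    right: object


@dataclass(frozen=True)
class Repeat:           # finite or infinite repetition
    body: object


def if_then_else(label, cond, p, q):
    """Choice between a guarded P and a Q guarded by the negated condition."""
    return Choice(Seq(Coercion(cond, label), p),
                  Seq(Coercion(("not", cond), label), q))


def while_do(label, cond, p):
    """Repeat the guarded body, then require the negated condition on exit."""
    return Seq(Repeat(Seq(Coercion(cond, label), p)),
               Coercion(("not", cond), label))


def is_program(t):
    """A template is a program when it contains no program variables."""
    if isinstance(t, ProgVar):
        return False
    if isinstance(t, (Seq, Choice)):
        return is_program(t.left) and is_program(t.right)
    if isinstance(t, Repeat):
        return is_program(t.body)
    return True
```

For instance, while_do("X", "busy", Action("skip")) builds a repetition of the 
guarded body followed by the exit coercion, and is_program returns False for 
any template that still mentions a ProgVar. 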

To define constraints and assertions, we use a modal logical language 
$\mathcal{L}$. For the sake of brevity, we use a simple version of propositional 
linear-time temporal logic. This language can be extended considerably without 
losing the properties we shall discuss. Given a set Prop of primitive 
propositions, the formulas of our language $\mathcal{L}$ are inductively defined 
as follows: 

Every proposition in Prop is a formula, and if $\varphi_1$ and $\varphi_2$ are 
formulas then so are $\neg\varphi_1$, $\varphi_1 \land \varphi_2$, and 
$\Box\varphi_1$. We define true as $\neg(p \land \neg p)$, false as $\neg$true, 
$\varphi_1 \lor \varphi_2$ as $\neg(\neg\varphi_1 \land \neg\varphi_2)$, 
$\varphi_1 \to \varphi_2$ as $\neg\varphi_1 \lor \varphi_2$, and $\Diamond\varphi$ 
as $\neg\Box\neg\varphi$, as usual. 



3 Semantics 

3.1 Semantics of Formulas 

Semantically, formulas of the modal language describe what is true at time 
points of the executions of programs in the environment $(S, A, R, \pi)$. A run 
over $(S, A, R, \pi)$ is defined by a pair $r = (h, \alpha)$ where 
$h : \mathbb{N} \to S$ and $\alpha : \mathbb{N} \to A$ are such that 
$(h(n), \alpha(n), h(n+1)) \in R$ for all $n \in \mathbb{N}$. The h component is a 
state history, describing the sequence of states the system goes through. The 
component $\alpha$ is the action history, describing the sequence of actions 
taken. Intuitively, we think of $h(k)$ as the state at time k, and $\alpha(k)$ as 
the action performed at time k. A point is a pair (r, k) where r is a run and 
$k \in \mathbb{N}$ represents a time. 

Formulas are said to be true or false at a point (r, k). The fact that 
$\varphi$ is true at (r, k) is denoted by $r, k \models \varphi$, and is defined 
by induction on the structure of formulas as follows (here and elsewhere, we 
leave the environment implicit): 

1. $r, k \models p$ for $p \in Prop$ if $h(k) \in \pi(p)$, where $r = (h, \alpha)$. 

2. $r, k \models \neg\varphi$ if $r, k \not\models \varphi$. 

3. $r, k \models \varphi_1 \land \varphi_2$ if both $r, k \models \varphi_1$ and $r, k \models \varphi_2$. 

4. $r, k \models \Box\varphi$ if $r, k' \models \varphi$ holds for all $k' \ge k$. 
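
The four clauses can be turned into a small evaluator, provided the run is 
finitely representable. The sketch below assumes an ultimately periodic state 
history (a finite prefix followed by a non-empty loop repeated forever), reads 
the box clause as quantifying over all k' at or after k, and ignores the action 
history, since formulas only inspect states. The tuple encoding of formulas and 
the names holds and diamond are hypothetical. 

```python
def holds(formula, prefix, loop, labelling, k):
    """Decide whether formula is true at time k of the run prefix . loop^omega.

    formula   -- ("prop", name) | ("not", f) | ("and", f, g) | ("box", f)
    prefix    -- list of states visited once
    loop      -- non-empty list of states repeated forever
    labelling -- dict mapping each proposition name to the set of states
                 in which it is true (the interpretation)
    """
    p, n = len(prefix), len(prefix) + len(loop)
    if k >= p:
        # Every time point inside the loop is equivalent to a canonical one.
        k = p + (k - p) % len(loop)
    states = prefix + loop
    op = formula[0]
    if op == "prop":
        return states[k] in labelling[formula[1]]
    if op == "not":
        return not holds(formula[1], prefix, loop, labelling, k)
    if op == "and":
        return (holds(formula[1], prefix, loop, labelling, k)
                and holds(formula[2], prefix, loop, labelling, k))
    if op == "box":
        # From inside the loop every loop position recurs; from the prefix
        # the rest of the prefix and the whole loop lie in the future.
        return all(holds(formula[1], prefix, loop, labelling, j)
                   for j in range(min(k, p), n))
    raise ValueError("unknown operator: %r" % (op,))


def diamond(f):
    """Derived operator: eventually f, defined as not box not f."""
    return ("not", ("box", ("not", f)))
```

Because the truth of a formula at a point inside the loop depends only on the 
position within the loop, checking finitely many canonical points suffices for 
the box clause. 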

3.2 Semantics of Programs 

We will develop the complete meaning of the program constructs in a number of 
steps. We begin in this section by presenting a semantics of programs that ignores 
the role of assertions and constraint variables: we will introduce the semantic 
notions related to these once we have motivated them from some requirements 
on our approach to refinement.¹ 

A time interval is a pair [c, d] where c and d are elements of 
$\mathbb{N}^+ = \mathbb{N} \cup \{\infty\}$ with $c \le d$. A run interval is a 
pair r, [c, d] consisting of a run r and a time interval [c, d]. We will refer 
simply to "intervals" when the specific type of interval is clear from the 
context. We give semantics to programs in an operational style, by defining when 
an execution of a program P occurs over an interval [c, d] in a run r, which we 
denote by $r, [c, d] \Vdash P$. 

An execution tree for a program P is an ordered tree representing a possible 
execution of the program. Intuitively, such a tree describes one particular way in 
which the nondeterministic choices that may be made in running the program are 
resolved. The left to right ordering of nodes in a tree corresponds to precedence 
in time. Formally, an execution tree will be a finite or infinite tree T in which 
each node is adorned by a program template, subject to the following conditions 
on the nodes n of T: 

1. If n is adorned by $\epsilon$, by a program variable, a constraint, an 
assertion or a basic action, then n is a leaf. 

2. If n is adorned by $P ; Q$ then n has exactly two children, the leftmost 
adorned by P, the other adorned by Q. 

3. If n is adorned by $P + Q$ then n has exactly one child, adorned by either P 
or Q. 

4. If n is adorned by $P^\omega$ then n has exactly one child, adorned either by 
$\epsilon$ or by $P ; (P^\omega)$. 

An execution tree for a program template P is an execution tree whose root is 
adorned by P. We write $\mathcal{E}(P)$ for the set of execution trees of P. 
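
As a rough illustration of how an execution tree resolves nondeterminism, the 
sketch below enumerates the left-to-right leaf sequences of the finite 
execution trees of a program, with each repetition unfolded at most a given 
number of times. Infinite trees are deliberately out of scope. The tuple 
encoding ("eps",), ("act", name), ("seq", P, Q), ("choice", P, Q), ("rep", P) 
and the function name fringes are assumptions made for this example only. 

```python
def fringes(prog, unfold):
    """Leaf sequences of finite execution trees of prog.

    Each repetition node is unfolded at most `unfold` times along any branch;
    every other leaf kind (actions, constraints, assertions, variables)
    is returned unchanged.
    """
    tag = prog[0]
    if tag == "seq":
        # Two children: leaves of the left subtree precede those of the right.
        return [left + right
                for left in fringes(prog[1], unfold)
                for right in fringes(prog[2], unfold)]
    if tag == "choice":
        # One child, adorned by either alternative.
        return fringes(prog[1], unfold) + fringes(prog[2], unfold)
    if tag == "rep":
        # One child: either the empty program, or body ; repetition.
        result = [(("eps",),)]
        if unfold > 0:
            result += [left + right
                       for left in fringes(prog[1], unfold)
                       for right in fringes(prog, unfold - 1)]
        return result
    return [(prog,)]


loop = ("rep", ("choice", ("act", "a"), ("act", "b")))
one_step = fringes(loop, 1)
```

Here one_step contains three leaf sequences: the immediate exit, and one 
iteration choosing either action followed by the exit. 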

Let T be an execution tree and $r = (h, \alpha)$ a run. We say that a mapping 
$\theta$ associating an interval with every node of T is an embedding of T in the 
interval r, [c, d] if the following conditions are satisfied: 

1. If n is the root of T then $\theta(n) = [c, d]$. 

2. If n has a single child m then $\theta(n) = \theta(m)$. 

3. If n has exactly two children $m_1$ and $m_2$ then there exist 
$e \le f \le g$ such that $\theta(m_1) = [e, f]$ and $\theta(m_2) = [f, g]$ and 
$\theta(n) = [e, g]$. 

4. If n is adorned by the empty program $\epsilon$ and $\theta(n) = [e, f]$ then 
$e = f$. 

5. If n is adorned by an assertion $\{\varphi\}_J$ and $\theta(n) = [e, f]$ then 
$e = f$. 

6. If n is adorned by a basic action a and $\theta(n) = [e, f]$ then either 
$e = f = \infty$, or $e < \infty$, $f = e + 1$, and $\alpha(e) = a$. 

7. If n is adorned by a coercion $[\varphi]^X$ and $\theta(n) = [e, f]$ then 
$e = f$ and if $e < \infty$ then $r, e \models \varphi$. 

8. If n is adorned by a specification $[\varphi, \psi]^X$ and 
$\theta(n) = [e, f]$ then if $e < \infty$ and $r, e \models \varphi$ then 
$f < \infty$ and $r, f \models \psi$. 

9. If $\theta(n) = [e, f]$ then f is the least upper bound of the set of 
$g \in \mathbb{N}^+$ such that $\theta(m) = [g', g]$ for some leaf node m 
descended from n. 

¹ Viewed in isolation, the definitions of this section could be given a simpler 
presentation: we go into the complexities of execution trees to prepare us for 
the later definition of validity. 

Notice that we made no special requirement about how $\theta$ should map a 
leaf adorned by a program variable $Z \in PV$. This is intentional, since in our 
view a program variable may be replaced by an arbitrary program, leading to 
an arbitrary mapping. In this sense, a program variable is treated just like a 
specification of the form [false, true]. 

To understand the need for condition (9) of the definition of embedding, let 
P be the program $([true]^X)^\omega$ and consider the infinite tree 
$T \in \mathcal{E}(P)$ depicted in Figure 1. (We omit the constraint variable X 
in this figure.) This tree depicts a computation in which a coercion is repeated 
infinitely often. Intuitively, as a coercion takes no time, this computation 
should take no time either. Suppose $\theta$ is an embedding of T into r, [c, d]. 
Thus, $\theta(root(T)) = [c, d]$. It follows from conditions (2), (3) and (7) of 
the definition of embedding that $\theta(n) = [c, c]$ for all leaves n of T (each 
of which is adorned by $[true]^X$). However, this still leaves open the 
possibility that $c < d$, which conflicts both with the intuition that the 
execution should take no time and with the intuition that the transition from 
time c to time d is effected by executing the sequence of statements represented 
at the leaves of the tree from left to right. 



    $[true]^\omega$ 
        $[true] ; [true]^\omega$ 
            $[true]$ 
            $[true]^\omega$ 
                $[true] ; [true]^\omega$ 
                    $[true]$ 
                    $[true]^\omega$ 
                        $[true] ; [true]^\omega$ 
                            $[true]$ 
                            $[true]^\omega$ 
                                ... 

Fig. 1. A tree requiring a limit condition on embeddings 



We write $r, [c, d] \Vdash_{T,\theta} P$ if $T \in \mathcal{E}(P)$ and $\theta$ is 
an embedding of T in the interval r, [c, d]. Intuitively, this means that T and 
$\theta$ describe an execution of P in this interval. We say that the program P 
occurs over r, [c, d], denoted by $r, [c, d] \Vdash P$, if there exist T and 
$\theta$ such that $r, [c, d] \Vdash_{T,\theta} P$. 

Define a program to be concrete if it does not contain assertions or 
specifications, it contains coercions only within if and while statements, and 
the formulas in these coercions contain no temporal operators. One can give a 
standard operational semantics to concrete programs, and establish its 
equivalence to the semantics presented above. We omit the details for reasons of 
space. 

4 Refinement 

In this section we consider some desiderata for the notion of program refinement 
for our framework. We propose some definitions in this section that are appro- 
priate for programs not containing assertions. We will later need to adjust these 
definitions in order to accommodate assertions. 

The notion of refinement we seek will be represented by means of a binary 
relation $\le$ on programs, such that $P \le Q$ when P is a refinement of Q. 
Intuitively, this means that the program P has less nondeterminism than Q, in the 
sense that every execution of P is an execution of Q. Recalling that our programs 
also function as specifications, an alternate way of phrasing this is that P 
carries more information, or is more constraining than Q. In order to support a 
reasonable calculus for refinement, we have the following desiderata for this 
refinement relation: it should be a reflexive, transitive relation satisfying 

Monotonicity: If $P \le Q$ and C(Z) is a program template then $C(P) \le C(Q)$.² 

Reduction of Nondeterminism: $P \le P + Q$ and $Q \le P + Q$. 

Increase of Information: $[\varphi]^X \le [\psi]^X$ if $\varphi \to \psi$ is valid. 

We will say that a refinement relation is adequate if it satisfies the above 
conditions for programs not containing assertions. 

Define a refinement relation $\sqsubseteq$ by having $P \sqsubseteq Q$ if for all 
intervals r, [c, d], if $r, [c, d] \Vdash P$ then $r, [c, d] \Vdash Q$. We write 
$P \equiv Q$ when $P \sqsubseteq Q$ and $Q \sqsubseteq P$. It is not difficult to 
show that this definition satisfies all the properties above: 

Theorem 1 The binary relation $\sqsubseteq$ on program templates is adequate. 

Example 1. Consider a program that is to access a shared resource such as a printer, 
which is guarded by a locking mechanism. Only an application that has a lock on the 
resource can use it. Moreover, we wish to ensure that the program does not hold on to 
the lock indefinitely. We model this by a proposition have_lock which is defined to be 
true when the application has obtained a lock on the resource, and three (terminating) 
actions: get_lock, release_lock, and use_resource, corresponding to the action of fetching 
the lock, releasing the lock, and using the resource. Given an appropriate definition of 
the environment (omitted), the first two actions have the property that 

$\mathit{get\_lock} \sqsubseteq [true, \mathit{have\_lock}]^U$ and $\mathit{release\_lock} \sqsubseteq [true, \neg\mathit{have\_lock}]^U$ 

for any constraint variable U. Moreover, 

$\mathit{use\_resource} \sqsubseteq [\mathit{have\_lock}, \mathit{have\_lock}]^U$ 

so that if started in a state where the application has the lock, the use of the resource 
will terminate in finite time, and does not relinquish the lock. 

The restrictions on the use of the resource by the application can be guaranteed by 
requiring that the resource can only be used via a legal access procedure, defined by: 

$Access(X, Y) = [true, \mathit{have\_lock}]^X ; \mathit{use\_resource} ; [true, \Diamond\neg\mathit{have\_lock}]^Y$ 

Note that this allows that after using the resource the program need not release the 
lock immediately. However, it must ensure that the lock is eventually released. In a 
followup example in Section 8 we will demonstrate how this is possible. We now turn 
to implementing Access(X, Y). From the fact that 
$\neg\mathit{have\_lock} \to \Diamond\neg\mathit{have\_lock}$ is a valid formula and 
the assumption that $\mathit{release\_lock} \sqsubseteq [true, \neg\mathit{have\_lock}]^Y$ 
we can show that $\mathit{release\_lock} \sqsubseteq [true, \Diamond\neg\mathit{have\_lock}]^Y$. 
Monotonicity of $\sqsubseteq$ now yields 

$(\mathit{get\_lock} ; \mathit{use\_resource} ; \mathit{release\_lock}) \sqsubseteq Access(X, Y)$, 

giving us one straightforward way of implementing Access. 

² We use the convention that if C(Z) is a program template containing the program 
variable Z then C(P) is the program template obtained by replacing every occurrence 
of Z in C(Z) by P. 

The following example illustrates a sense in which our sequential framework 
is already able to capture some aspects of a distributed setting. 

Example 2. The previous example treats the action of obtaining the lock as taking 
a single time step. This makes sense if time steps represent state transitions for the 
program we are developing, but if the program operates concurrently with other 
programs that may hold a lock on the resource, then we may wish to reason about the 
period during which the program waits for a lock held by another process to be 
released. This may be done by introducing another proposition $\mathit{have\_lock}_e$ 
to represent that the environment of the program holds the lock, and an atomic action 
lock such that $\mathit{lock} \sqsubseteq [\neg\mathit{have\_lock}_e, \mathit{have\_lock}]^X$, 
as well as an atomic action skip that does not change the value of have_lock but may 
switch the value of $\mathit{have\_lock}_e$. We may then implement the specification 
$[true, \mathit{have\_lock}]^X$ by the refinement 

$[\Box(\mathit{have\_lock}_e \to \Diamond\neg\mathit{have\_lock}_e)]^X ; (\mathbf{while}_X\ \mathit{have\_lock}_e\ \mathbf{do}\ \mathit{skip}) ; \mathit{lock} \sqsubseteq [true, \mathit{have\_lock}]^X$. 

Note that the coercion in this program states that the environment of the program is 
required to eventually release any lock it may hold; this ensures eventual termination 
of the while loop and success of the lock action. (We could eliminate the need for such 
coercions by generalizing environments to include liveness conditions, but we will not 
pursue this here.) 



5 Reasoning with Assertions 

In deriving programs by refinement, it is frequently the case that one part of the 
program being developed ensures conditions that can be used to optimize other 
parts of the program. To take advantage of such opportunities for optimization, 
it is useful to have in a refinement calculus a facility for making assertions that 
are derivable from program fragments, and rules that allow such assertions to 
be exploited in refinement steps. The assertions $\{\varphi\}_J$ in our framework 
are intended to play such a role. A further desideratum for our framework is that 
it should support reasoning about the temporal properties guaranteed by a 
program fragment based on the properties of program fragments that will run in 
the future. (We present an example of such reasoning later.) We now explain how 
this desideratum motivates our use of constraint variables and justifications. We 
suppose for the sake of the argument that constraints and assertions do not have 
these annotations. 

Consider the program $[true, \varphi]$. When this program terminates (as it must) 
$\varphi$ will be true. Thus, for a refinement relation $\le$ that supports the 
introduction of assertions into the program text, we would like the rule 
$[true, \varphi] ; \{\varphi\} \le [true, \varphi]$ to be sound. Intuitively, 
$\{\varphi\}$ asserts that $\varphi$ is guaranteed to hold at this location. 
Moreover, we would like to be able to exploit assertions to perform refinements. 
When $[true, \varphi] ; \{\varphi\}$ occurs within the context of a larger 
program, the reason for the assertion $\{\varphi\}$ may be some following part of 
the program that guarantees that $\varphi$ will hold at this location. (For 
example, $\varphi$ might assert that a message will eventually be delivered, and 
the following code may send this message on a reliable channel.) In this 
situation, we would like to be able to use this fact to simplify the 
specification $[true, \varphi]$ to $\epsilon$. Intuitively, this program fragment 
need do nothing, since its intended effect is taken care of by some other part of 
the program. This suggests that we want the rule 
$\epsilon \le [true, \varphi] ; \{\varphi\}$ to be sound. 

Of course, we cannot have both these rules, since we would obtain by 
transitivity that $\epsilon \le [true, \varphi]$, which clearly cannot be sound. 
What has gone wrong is that we have attempted to refine a program fragment based 
on an assertion derived from that very program fragment. This is circular 
reasoning! To block such circularities, we keep track of the constraints that the 
truth of an assertion depends on, and allow refinement steps of this type only 
when they do not involve circular reasoning. We do this by labeling constraints 
by constraint variables, providing assertions with justifications, and carefully 
tracking the way that assertions depend on constraints by appropriately adjusting 
the justifications when refinement steps are performed. Intuitively, an 
assertion $\{\varphi\}_J$ in a program amounts to a claim that $\varphi$ holds, 
with the proof depending on the fact that the requirements implied by the 
constraints associated with all $X \in J$ are satisfied. For example, if 
$Y \in J$ and $[\psi_1, \psi_2]^Y$ occurs in the program, then the proof of 
$\varphi$ may depend on the segment of code that the designer ultimately 
substitutes for $[\psi_1, \psi_2]^Y$ having the property that if started in a 
state satisfying $\psi_1$, it will terminate in a state satisfying $\psi_2$. 

Using this idea, the two refinements above become 
$[true, \varphi]^X ; \{\varphi\}_{\{X\}} \le [true, \varphi]^X$, and 
$\epsilon \le [true, \varphi]^X ; \{\varphi\}_J$ provided $X \notin J$. The latter 
still allows us to perform the optimization when the assertion is justified by 
some other part of the program, while blocking the undesirable refinement above. 

The decision to track justifications of assertions and allow rules such as 

"$\epsilon \le [true, \varphi]^X ; \{\varphi\}_J$ provided $X \notin J$" 

leads to some further complexities. Note that this rule eliminates the constraint 
$[true, \varphi]^X$. The larger program within which this refinement occurs may 
contain assertions that depend on this constraint. These assertions are still 
valid, but the reason for the validity is now that $\epsilon$ runs in place of 
$[true, \varphi]^X$, and that this constitutes a correct implementation of 
$[true, \varphi]^X$ because of the condition $\varphi$ guaranteed by the 
constraints in J. That is, J becomes part of the explanation for assertions that 
previously depended on X. This means that in applying this rule, we need to 
transform the justifications for assertions elsewhere in the program. 

To formalize these transformations, define a justification transformation to 
be a mapping $\eta : \mathcal{P}(CV) \to \mathcal{P}(CV)$ that is increasing, i.e., 
satisfies $J \subseteq \eta(J)$ for all $J \subseteq CV$. The result of applying a 
justification transformation $\eta$ to a program P (or, respectively, program 
template C(Z)), is the program $P\eta$ (respectively, the program template 
$C\eta(Z)$) obtained by replacing every assertion $\{\varphi\}_J$ in P 
(respectively, in C), by the corresponding assertion $\{\varphi\}_{\eta(J)}$ in 
which its justification set has been transformed by $\eta$. We write $C\eta(P)$ 
for the program obtained by substituting P for Z in $C\eta(Z)$. 

The assumption that justification transformations are increasing arises from
the fact that we allow more than one constraint to be associated with the same
constraint variable. (Programs in which this is the case arise naturally from
desirable program transformations, such as refining $P^\omega$ by $P ; P^\omega$. This
transformation would create copies of any constraints in $P$, resulting in multiple
occurrences of constraint variables.) Consider a program such as

$\{twice(p)\}_{\{X\}} ; [p]_Y ; \{p\}_{\{Y\}} ; [p]_X ; a ; [p]_X$

where $a$ is any atomic action and $twice(p)$ expresses that there are at least two
distinct time points in the future at which $p$ holds. This program is valid. If we
were to apply the coercion elimination rule discussed above, and apply a
justification transformation $\eta$ mapping $\{X\}$ to $\{Y\}$, we would obtain the program

$P = \{twice(p)\}_{\{Y\}} ; [p]_Y ; \varepsilon ; a ; [p]_X$

This program is not valid, because it states $\{Y\}$ as the justification set for the
truth of the assertion $twice(p)$. The program $P^{\{Y\}}$, which is obtained from $P$
by replacing the coercion $[p]_X$ by $\varepsilon$, has executions in which $p$ holds only once
and $twice(p)$ is falsified. A sound refinement step would have stated that the
assertion $twice(p)$ depends on both $X$ and $Y$. Hence, the appropriate justification
transformation here is to map $\{X\}$ to $\{X, Y\}$; the elimination of a coercion $[p]_X$
introduced a dependency on $Y$, but the additional dependency on $X$ still needs to
be accounted for. More generally, when transforming justifications in refining a
constraint with label $X$, we need to preserve instances of $X$, since occurrences of
$X$ in assertions may be due to $X$-constraints other than the one being refined.

It is convenient to introduce some notation for particular justification
transformations that will occur frequently in what follows. The identity justification
transformation, in which $\eta(J) = J$ for every $J \subseteq CV$, is denoted by $\iota$. We
will also represent justification transformations using expressions of the form
$[X_1 \mapsto S_1, \ldots, X_n \mapsto S_n]$, where the $X_i \in CV$ are constraint variables and
the $S_i \subseteq CV$ are sets of constraint variables. (In the simple case of $n = 1$,
we write $X \mapsto S$ instead of $[X \mapsto S]$, and if $S$ is a singleton $\{Y\}$ we write
$X \mapsto Y$.) Such an expression denotes the justification transformation $\eta$ defined







R. van der Meyden and Y. Moses 



by $\eta(J) = J \cup \bigcup_{X_i \in J} S_i$ for all $J \subseteq CV$. It is also useful to talk about the
composition of justification transformations. We define the composition $\eta \cdot \eta'$ by
$(\eta \cdot \eta')(J) = \eta(\eta'(J))$ for all $J \subseteq CV$.
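
The notation for justification transformations can be explored with a small executable model. The following Python sketch is an illustration only and not part of the formal development: justification sets are modelled as frozensets of variable names, and the helper names `subst`, `compose`, `subsets` and `is_increasing` are hypothetical. It builds a transformation from a table of substitutions, checks the increasing property by brute force over a finite set of constraint variables, and contrasts the transformation written X ↦ Y with a map that merely renames X to Y, which is not increasing.

```python
from itertools import chain, combinations


def subst(mapping):
    """Model [X1 -> S1, ..., Xn -> Sn]: eta(J) = J plus every S_i with X_i in J."""
    def eta(J):
        J = frozenset(J)
        return J.union(*(mapping[x] for x in J if x in mapping))
    return eta


def compose(eta, eta_prime):
    """Composition of transformations: J is sent to eta(eta_prime(J))."""
    return lambda J: eta(eta_prime(J))


def subsets(cv):
    """All subsets of the finite set of constraint variables cv."""
    items = sorted(cv)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


def is_increasing(eta, cv):
    """Check that J is contained in eta(J) for every subset J of cv."""
    return all(J <= eta(J) for J in subsets(cv))


CV = {"X", "Y"}
identity = subst({})              # the identity transformation
x_to_y = subst({"X": {"Y"}})      # the transformation written X -> Y


def rename_x_to_y(J):
    """Hypothetical counterexample: replaces X by Y instead of adding Y."""
    return frozenset("Y" if v == "X" else v for v in J)


print(sorted(x_to_y({"X"})))              # ['X', 'Y']
print(is_increasing(x_to_y, CV))          # True
print(is_increasing(rename_x_to_y, CV))   # False
```

Because `subst` always unions into the original set, every transformation it builds is increasing by construction; only a hand-written map such as `rename_x_to_y` can violate the requirement.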

6 Validity and Valid Refinement 

We now provide a formal semantics for the ideas motivated in the previous 
section. 

6.1 Validity 

The semantics of assertions $\{\varphi\}_J$ can be made precise as follows. First, to help
capture the dependency of the assertion on $J$, we define a program transformation
that modifies a program to one in which the program structure is maintained,
but only the constraints labelled by a variable in a given justification
set are enforced. Let $J$ be a justification set and $P$ a program. We define the
program $P^J$ by a recursion on the structure of $P$. In the cases where $P$ is the
null program $\varepsilon$, an atomic action $a$, a program variable $Z$, or an assertion
$\{\varphi\}_{J'}$, we define $P^J = P$. The other base cases concern program constraints. In the
case of coercions, we define $([\varphi]_X)^J$ to be $[\varphi]_X$ if $X \in J$ and $\varepsilon$ otherwise. That
is, if the constraint variable labeling a coercion is in the set $J$, then the coercion
is preserved; otherwise it is eliminated. Note that $\varepsilon$ and the coercion it replaces
are programs requiring no time to execute, so the temporal behaviour of the
program is not modified by this substitution. In the case of specifications, we define
$([\varphi, \psi]_X)^J$ to be $[\varphi, \psi]_X$ if $X \in J$ and $[\mathit{false}, \mathit{true}]_X$ otherwise. Again, if the
constraint variable $X$ labeling the specification is in the set $J$, then the
specification is preserved. If $X \notin J$, then the specification is replaced by $[\mathit{false}, \mathit{true}]_X$,
which is the specification that does not impose any requirements. Intuitively, this
specification may run for any finite or infinite amount of time, performing an
arbitrary sequence of actions. More formally, it can be seen from the definition
of occurrence that $r, [c, d] \Vdash [\mathit{false}, \mathit{true}]_X$ holds for every interval $r, [c, d]$.

The recursive cases of the definition of $P^J$ are given by: (i) $(P ; Q)^J = P^J ; Q^J$,
(ii) $(P + Q)^J = P^J + Q^J$ and (iii) $(P^\omega)^J = (P^J)^\omega$. Intuitively, these cases preserve
the structure of the program and filter the transformation down to the base cases.
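
The recursion above is easy to mechanize. The following Python sketch is a hedged illustration under simplifying assumptions: formulas are kept as opaque strings, program variables are omitted, and all class and function names (`Eps`, `Coercion`, `Spec`, `restrict`, and so on) are hypothetical. It shows how constraints whose label lies outside the justification set are replaced while the program structure is kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eps:                      # the null program
    pass


@dataclass(frozen=True)
class Atom:                     # an atomic action
    name: str


@dataclass(frozen=True)
class Assertion:                # {formula}_just
    formula: str
    just: frozenset


@dataclass(frozen=True)
class Coercion:                 # [formula]_label
    formula: str
    label: str


@dataclass(frozen=True)
class Spec:                     # [pre, post]_label
    pre: str
    post: str
    label: str


@dataclass(frozen=True)
class Seq:                      # left ; right
    left: object
    right: object


@dataclass(frozen=True)
class Choice:                   # left + right
    left: object
    right: object


@dataclass(frozen=True)
class Omega:                    # iteration of body
    body: object


def restrict(p, J):
    """Return the program P^J: keep only constraints whose label is in J."""
    if isinstance(p, Coercion):
        return p if p.label in J else Eps()
    if isinstance(p, Spec):
        return p if p.label in J else Spec("false", "true", p.label)
    if isinstance(p, Seq):
        return Seq(restrict(p.left, J), restrict(p.right, J))
    if isinstance(p, Choice):
        return Choice(restrict(p.left, J), restrict(p.right, J))
    if isinstance(p, Omega):
        return Omega(restrict(p.body, J))
    return p   # null program, atomic actions and assertions are unchanged


# Hypothetical program: [p]_X ; a ; [phi, psi]_Y ; {q}_{X}
prog = Seq(Coercion("p", "X"),
           Seq(Atom("a"),
               Seq(Spec("phi", "psi", "Y"),
                   Assertion("q", frozenset({"X"})))))

print(restrict(prog, {"X"}))
```

Restricting `prog` to the set containing only X keeps the coercion and turns the Y-labelled specification into the unconstrained specification, which is the program against which the assertion justified by X would be checked.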

Notice that in moving from $P$ to $P^J$ we are coarsening the program:

Lemma 1. For every program $P$ and justification set $J$, $P \sqsubseteq P^J$.

We define a program $P$ to be valid if the following condition holds: for every
assertion $\{\varphi\}_J$ appearing in $P$, for all $r$, $[c, d]$, $T$, $\theta$, $n$ and $e$, if
(i) $r, [c, d] \Vdash_{T,\theta} P^J$, (ii) $n$ is a node of $T$ adorned by $\{\varphi\}_J$ and
(iii) $\theta(n) = [e, e]$ where $e < \infty$, then $r, e \models \varphi$.

Intuitively, this condition holds when the assertion $\varphi$ holds at each point of
an execution of the program $P^J$ that corresponds to a location in the text of
$P^J$ at which the assertion $\{\varphi\}_J$ occurs. Note that we check assertions only at
finite times, and do not require for validity that all assertions of the program be
reached during the execution. In this respect, our notion of validity resembles
partial correctness of program assertions [Hoa67].

Assuming the standard semantics for the assignment statement and the ob- 
vious interpretations of the propositions, the following is an example of a valid 
program: 

$P_1 = [x = 0]_X ; \{\Diamond(y > 1)\}_{\{X\}} ; y \leftarrow x + 1 ; y \leftarrow y + 1 ; \{y = 2\}_{\{X\}}$

The justification of the assertions in this program is the singleton set $\{X\}$.
The truth of the assertions in this program depends on more than just the
coercion $[x = 0]_X$, but the other parts it depends on are concrete program
segments, which cannot be modified at a later stage of a refinement process.

Program $P_2$ is another valid program, illustrating the sense in which validity
represents a notion of partial correctness:

$P_2 = [x = 0]_X ; (\mathbf{while}_Y\ \mathit{true}\ \mathbf{do}\ x \leftarrow x + 1) ; \{\mathit{false}\}_{\{Y\}}$

To see that this program is valid, notice that $\{\mathit{false}\}_{\{Y\}}$ is the only assertion
in $P_2$. In all execution trees of $P_2^{\{Y\}}$ that can be embedded in an interval $r, [c, d]$,
the node adorned by this assertion is mapped to infinity, because it is preceded
by infinitely many time steps in which $x$ is incremented. The assertion is thus
vacuously satisfied at all finite points that its nodes are mapped to, and it follows
that $P_2$ is valid. Notice that this example also shows that the general problem of
deciding whether a program is valid is at least as hard as proving nontermination
of programs in our language.

The slight variant $[x = 0]_X ; (\mathbf{while}_Y\ \mathit{true}\ \mathbf{do}\ x \leftarrow x + 1) ; \{\mathit{false}\}_{\{X\}}$
is not a valid program, however.



6.2 Valid Refinement 

We are now ready to define a validity-preserving notion of refinement. As noted
above, we need to apply a justification transformation $\eta$ to the surrounding
context in some cases. We will write $P \le_\eta Q$ to state that a program template $P$
validly refines a program template $Q$, subject to a justification transformation $\eta$
to be applied to the syntactic context in which the refinement is to be applied.
We define $P \le_\eta Q$ to hold if for all program templates $C(Z)$, if $C(Q)$ is valid
then (i) $C\eta(P) \sqsubseteq C(Q)$, and (ii) $C\eta(P)$ is valid. Notice that the transformation $\eta$
affects only the form of assertions, which are effectively ignored by the refinement
relation $\sqsubseteq$. As a result, $P$ and $P\eta$, and likewise $C(P)$ and $C\eta(P)$, are equivalent
with respect to $\sqsubseteq$ for every $C$, $P$, and $\eta$.
Thus, condition (i) in the definition of valid refinement can equivalently be stated
as (i') $C(P) \sqsubseteq C(Q)$. Moreover, since $\sqsubseteq$ is monotone, we can often obtain (i')
from $P \sqsubseteq Q$. As we will see, however, there are important cases in which $P \le_\eta Q$
holds although $P \not\sqsubseteq Q$.

Clearly, for every program template $C(Z)$, if $P \le_\eta Q$ and $C(Q)$ is valid
then $C(P) \sqsubseteq C(Q)$. Thus, $P \le_\eta Q$ states in part that it is possible to obtain a
refinement of any valid program by locally substituting $P$ for $Q$. In addition to







this, the definition requires that this substitution be validity preserving, provided
we transform the justifications in the context by $\eta$. The following result shows
that this relation has the transitivity and monotonicity properties, provided that
we make appropriate allowance for the justification transformations.

Theorem 2. The relations $\le_\eta$ satisfy the following:

1. If $P \le_\eta Q$ and $Q \le_{\eta'} R$ then $P \le_{\eta \cdot \eta'} R$.

2. If $P \le_\eta Q$ then $C\eta(P) \le_\eta C(Q)$ for all program templates $C(Z)$.

An easy corollary of Theorem 2 is that $\le_\iota$ is an adequate refinement relation.
We write $P \le Q$ for $P \le_\iota Q$, and write $P \asymp Q$ when both $P \le_\iota Q$ and $Q \le_\iota P$.
The following example illustrates how assertions combine with the notion of
valid refinement to enable refinement steps that exploit properties of the context
in which the refinement takes place.

Example 3. In Example 2, we refined a specification to a concrete program together
with a coercion that states a property that the environment must have for this
concrete program to implement the specification. We may restate this example using valid
refinement as follows:

$(\mathbf{while}_X\ \mathit{have\_lock}_e\ \mathbf{do}\ \mathit{skip}) ; \mathit{lock} \le_{X \mapsto J} \{\Box(\mathit{have\_lock}_e \to \Diamond\neg\mathit{have\_lock}_e)\}_J ; [\mathit{true}, \mathit{have\_lock}]_X$

provided $X \notin J$. Intuitively, this states that the refinement is valid provided it is done
in a context in which one can prove that the environment is guaranteed to eventually
release the lock.



7 Rules for Valid Refinement 

We now present a number of refinement rules sound with respect to the above 
semantics. (This is not a complete list of rules for our semantics.) First, we have 
a rule which enables one of the most basic steps used in top-down design of 
sequential programs (cf. [Mor90]): 

Sequential composition: $[\varphi, \psi]_X ; [\psi, \chi]_Y \le_{Z \mapsto \{X, Y\}} [\varphi, \chi]_Z$

Many of the properties of $\sqsubseteq$ translate into valid refinements with the identity
transformation:

Identity: $\varepsilon ; P \asymp P$ and $P ; \varepsilon \asymp P$

$\omega$-rules: $P^\omega \asymp (\varepsilon + P ; P^\omega)$ and $P^\omega ; P^\omega \asymp P^\omega$

Associativity: $(P ; Q) ; R \asymp P ; (Q ; R)$ and $(P + Q) + R \asymp P + (Q + R)$

Commutativity and Idempotence of $+$: $P + Q \asymp Q + P$ and $P + P \asymp P$

Distribution: $P ; (Q + R) \asymp (P ; Q) + (P ; R)$ and $(Q + R) ; P \asymp (Q ; P) + (R ; P)$







We say that a program is vanishing if it is the empty program $\varepsilon$, a
coercion $[\varphi]_X$ or an assertion $\{\varphi\}_J$. These programs are called vanishing because
they occur over point intervals and take no time to execute. The rules for
vanishing programs do not hold in the formalism of Morgan [Mor90], for example.

Commutativity and Idempotence of vanishing programs: for all vanishing
$P$, $Q$, we have $P ; Q \asymp Q ; P$ and $P ; P \asymp P$

The examples just given demonstrate standard refinements that still hold as
valid refinements. Not every standard refinement translates into a valid
refinement. For example, we clearly have $\{\mathit{false}\} \sqsubseteq \varepsilon$ but not $\{\mathit{false}\} \le \varepsilon$, since for
the template $C(Z) = Z$ we have that $C(\varepsilon)$ is a valid program, while $C(\{\mathit{false}\})$
is not. But even for programs that do not contain assertions, one needs to be
careful. For example, we have $[\varphi]_X \sqsubseteq [\varphi]_Y$. This does correspond to a valid
refinement, but one in which anything that depends on $Y$ should be made to
depend on both $X$ and $Y$. Thus, we have the valid refinement rule

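Renaming Constraint: $[\varphi]_X \le_{Y \mapsto X} [\varphi]_Y$ and $[\varphi, \psi]_X \le_{Y \mapsto X} [\varphi, \psi]_Y$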

A major reason for using valid refinement is for proving that $P \le_\eta Q$ in cases
when $P \sqsubseteq Q$ does not hold. Intuitively, this will be the case for refinements that
exploit properties of the context in which the refinement step is taking place.
Examples of this are:

Coercion Elimination: $\varepsilon \le_{X \mapsto J} \{\varphi\}_J ; [\varphi]_X$ provided $X \notin J$

Specification Elimination: $\varepsilon \le_{X \mapsto J} [\varphi, \psi]_X ; \{\varphi \to \psi\}_J$ and
$\varepsilon \le_{X \mapsto J} \{\varphi \to \psi\}_J ; [\varphi, \psi]_X$, provided $X \notin J$

Of special interest are rules that allow us to move temporal assertions around
the program text. One example is

Advance Box: $P ; \{\Box\varphi\}_J \le \{\Box\varphi\}_J ; P$

which moves a temporal assertion forward in time. It is also possible to move
temporal assertions backwards.

Regress Diamond: If $P \le_{X \mapsto J_2} [\mathit{true}, \mathit{true}]_X$ then $\{\Diamond\varphi\}_{J_1 \cup J_2} ; P \le P ; \{\Diamond\varphi\}_{J_1}$

We remark that the statement $P \le_{X \mapsto J_2} [\mathit{true}, \mathit{true}]_X$ ensures that $P$ is a
halting program, and so it provides one approach to expressing termination.

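A few additional properties of $\le$ that we shall use in the following example are

Strengthen Spec: If $\varphi_1 \to \varphi$ and $\psi \to \psi_1$ are valid formulae of the logic
$\mathcal{L}$ then $[\varphi, \psi]_X \le_{Y \mapsto X} [\varphi_1, \psi_1]_Y$

Specification Consequence: $[\varphi, \psi]_X ; \{\psi\}_{J \cup \{X\}} \le \{\varphi\}_J ; [\varphi, \psi]_X$

Valid Assertion: If $\varphi$ is a valid formula of the logic $\mathcal{L}$ then $\{\varphi\}_\emptyset \le \varepsilon$.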

As mentioned above, some of our rules are fairly standard, and appear in 
other refinement calculi. In the example presented in the next section we make 
effective use of two rules which do not hold in typical systems: The Regress 
Diamond and Specification Elimination rules. 







8 Optimization Using Refinement 

We now consider an example that illustrates some of the rules mentioned above
and shows that our framework differs from others in useful ways. In the setting
of Example 1 from Section 4, consider a case in which the resource needs to be
used twice in a row. Thus, our goal is to implement the program Access-twice
defined by

$\mathit{Access\text{-}twice} = \mathit{Access}(X1, Y1) ; \mathit{Access}(X2, Y2)$.

Recall from Example 1 that

$\mathit{Access}(X2, Y2) = [\mathit{true}, \mathit{have\_lock}]_{X2} ; \mathit{use\_resource} ; [\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2}$.

In Example 1 we argued that

$(\mathit{get\_lock} ; \mathit{use\_resource} ; \mathit{release\_lock}) \sqsubseteq \mathit{Access}(X2, Y2)$.

A similar valid refinement can be established:

$(\mathit{get\_lock} ; \mathit{use\_resource} ; \mathit{release\_lock}) \le \mathit{Access}(X2, Y2)$.

Indeed, one way to implement Access-twice would be to simply repeat this
program twice in sequence. It is possible to do better, however. We now demonstrate
how, with the aid of valid refinement, our framework can be used to obtain a
more efficient implementation.

Let us first consider the final step of $\mathit{Access}(X2, Y2)$, i.e., the program
$[\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2}$. Note that by using Identity and Valid Assertion (since
true is a valid formula) we can obtain

$\{\mathit{true}\}_\emptyset ; [\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2} \le \varepsilon ; [\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2} \le [\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2}$,

and the Specification Consequence rule yields

$[\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2} ; \{\Diamond\neg\mathit{have\_lock}\}_{\{Y2\}} \le [\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2}$.

By Monotonicity we thus have that

$\mathit{Access}(X2, Y2) ; \{\Diamond\neg\mathit{have\_lock}\}_{\{Y2\}} \le \mathit{Access}(X2, Y2)$.

One of the strong points of our framework is that we can now use our rules
to move the temporal assertion to the beginning of $\mathit{Access}(X2, Y2)$. The idea is
to apply the Regress Diamond rule three times and thereby move the assertion
back over each of the components of $\mathit{Access}(X2, Y2)$.

Notice that by Strengthen Spec we have¹

$[\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2} \le [\mathit{true}, \mathit{true}]_{Y2}$

¹ Strictly speaking, applying the rule we should write $\le_{Y2 \mapsto Y2}$ in the valid refinement,
but it is easy to check that the justification transformation $Y2 \mapsto Y2$ is the identity
transformation.




Hence, by Regress Diamond we have that

$\{\Diamond\neg\mathit{have\_lock}\}_{\{Y2\}} ; [\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2} \le [\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2} ; \{\Diamond\neg\mathit{have\_lock}\}_{\{Y2\}}$,

so we can move the temporal assertion back over the last component of
$\mathit{Access}(X2, Y2)$. Since use_resource is a basic (terminating) action, we have
$\mathit{use\_resource} \le [\mathit{true}, \mathit{true}]_{X2}$. Thus, again by Regress Diamond we can derive

$\{\Diamond\neg\mathit{have\_lock}\}_{\{Y2\}} ; \mathit{use\_resource} \le \mathit{use\_resource} ; \{\Diamond\neg\mathit{have\_lock}\}_{\{Y2\}}$.

Finally, we again apply Strengthen Spec to obtain that

$[\mathit{true}, \mathit{have\_lock}]_{X2} \le [\mathit{true}, \mathit{true}]_{X2}$

and hence

$\{\Diamond\neg\mathit{have\_lock}\}_{\{X2, Y2\}} ; [\mathit{true}, \mathit{have\_lock}]_{X2} \le [\mathit{true}, \mathit{have\_lock}]_{X2} ; \{\Diamond\neg\mathit{have\_lock}\}_{\{Y2\}}$.

The end result of moving the temporal assertion back over these three clauses
yields:

$\{\Diamond\neg\mathit{have\_lock}\}_{\{X2, Y2\}} ; \mathit{Access}(X2, Y2) \le \mathit{Access}(X2, Y2)$.

By monotonicity, it follows that

$\mathit{Access}(X1, Y1) ; \{\Diamond\neg\mathit{have\_lock}\}_{\{X2, Y2\}} ; \mathit{Access}(X2, Y2) \le \mathit{Access\text{-}twice}$.

Let us now focus on the interaction of the assertion in this program with the
final statement in $\mathit{Access}(X1, Y1)$. Since $Y1 \notin \{X2, Y2\}$, we have by Specification
Elimination that

$\varepsilon \le_{Y1 \mapsto \{X2, Y2\}} [\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y1} ; \{\Diamond\neg\mathit{have\_lock}\}_{\{X2, Y2\}}$

so that we can eliminate the third clause in $\mathit{Access}(X1, Y1)$ and derive

$[\mathit{true}, \mathit{have\_lock}]_{X1} ; \mathit{use\_resource} ; \mathit{Access}(X2, Y2) \le_{Y1 \mapsto \{X2, Y2\}} \mathit{Access\text{-}twice}$.

One further improvement is now possible. As we did for the final step of
$\mathit{Access}(X2, Y2)$, we can apply the Identity, Valid Assertion and Specification
Consequence rules to $[\mathit{true}, \mathit{have\_lock}]_{X1}$, the first step of $\mathit{Access}(X1, Y1)$, and
obtain:

$[\mathit{true}, \mathit{have\_lock}]_{X1} ; \{\mathit{have\_lock}\}_{\{X1\}} \le [\mathit{true}, \mathit{have\_lock}]_{X1}$.

With an additional application of Specification Consequence, using the property

$\mathit{use\_resource} \le [\mathit{have\_lock}, \mathit{have\_lock}]_{X1}$

which was one of the given assumptions, we can derive

$[\mathit{true}, \mathit{have\_lock}]_{X1} ; \mathit{use\_resource} ; \{\mathit{have\_lock}\}_{\{X1\}} \le [\mathit{true}, \mathit{have\_lock}]_{X1} ; \mathit{use\_resource}$.

We can now apply the assertion $\{\mathit{have\_lock}\}_{\{X1\}}$ to the first statement in
$\mathit{Access}(X2, Y2)$, using Specification Elimination to obtain

$\varepsilon \le_{X2 \mapsto X1} \{\mathit{have\_lock}\}_{\{X1\}} ; [\mathit{true}, \mathit{have\_lock}]_{X2}$,

and we can use this to derive that

$[\mathit{true}, \mathit{have\_lock}]_{X1} ; \mathit{use\_resource} ; \mathit{use\_resource} ; [\mathit{true}, \Diamond\neg\mathit{have\_lock}]_{Y2} \le_\eta \mathit{Access\text{-}twice}$

where $\eta = [X2 \mapsto X1] \cdot [Y1 \mapsto \{X2, Y2\}] = [X2 \mapsto X1, Y1 \mapsto \{X1, X2, Y2\}]$.
The first and last steps of this program may be refined to get_lock and release_lock,
respectively, as before, yielding a concrete implementation of Access-twice:

$\mathit{get\_lock} ; \mathit{use\_resource} ; \mathit{use\_resource} ; \mathit{release\_lock} \le_\eta \mathit{Access\text{-}twice}$.

Note that we have achieved an interesting optimization by means of this rea- 
soning: instead of releasing the lock after the first use and then re-acquiring 
it immediately, we simply hold onto the lock and use the resource twice. This 
optimization made essential use of valid refinement. Moreover, moving temporal 
assertions (both forward and backward) in the program text played an important 
role. 
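
The bookkeeping in this derivation can be checked mechanically. The following Python sketch is illustrative only: it models justification sets as frozensets of names, and the helpers `subst`, `compose` and `subsets` are hypothetical. It verifies, by brute force over all subsets of the four constraint variables, that composing X2 ↦ X1 after Y1 ↦ {X2, Y2} agrees with the single transformation [X2 ↦ X1, Y1 ↦ {X1, X2, Y2}], and that composing in the opposite order does not.

```python
from itertools import chain, combinations


def subst(mapping):
    """eta(J) = J plus every set mapping[x] for x in J."""
    def eta(J):
        J = frozenset(J)
        return J.union(*(mapping[x] for x in J if x in mapping))
    return eta


def compose(eta, eta_prime):
    """J is sent to eta(eta_prime(J))."""
    return lambda J: eta(eta_prime(J))


def subsets(cv):
    items = sorted(cv)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


CV = {"X1", "Y1", "X2", "Y2"}
later_step = subst({"X2": {"X1"}})              # X2 -> X1
earlier_step = subst({"Y1": {"X2", "Y2"}})      # Y1 -> {X2, Y2}
combined = subst({"X2": {"X1"}, "Y1": {"X1", "X2", "Y2"}})

eta = compose(later_step, earlier_step)
agrees = all(eta(J) == combined(J) for J in subsets(CV))
print(agrees)   # True

swapped = compose(earlier_step, later_step)
print(sorted(swapped({"Y1"})))   # ['X2', 'Y1', 'Y2']
```

The swapped composition loses the dependency on X1 for assertions justified by Y1, which is why the order of composition matters when refinement steps are chained.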

9 Conclusion 

As discussed in the introduction, there exist many approaches to program
refinement. Perhaps the work most closely related to ours is the work on real-time
refinement originated by Utting and Fidge [UF96,UF97,HU97,Hay98]. This work
shares our objective of developing a Back/Morgan-style framework with
temporal features. In particular, Utting and Fidge define a real time specification
construct $\star[\varphi, \psi]$ similar to our $[\varphi, \psi]_X$ in that $\varphi$ and $\psi$ may be formulae
expressing properties not just of the current state, but also of the run. They give
semantics to this construct by translation to an instance of Morgan’s
refinement calculus in which states are taken to correspond to what we have called
“points” in this paper, i.e. a run together with a “current time” variable. The
construct $\star[\varphi, \psi]$ is represented by a predicate transformer over such states that
“changes only the current time.” The resulting calculus is like ours in many
ways. A crucial difference, however, is that it does not support refinements like
$\{\Diamond\varphi\} ; \star[\mathit{true}, \varphi] \le \star[\mathit{true}, \varphi]$ which are essential to the example in Section 8.
Their framework does not make use of explicit justifications, but, intuitively,
it avoids circular temporal refinements like that discussed in Section 5 by
allowing assertions to be “justified” only by constraints that have “executed” at an
earlier time. Thus, in the refinement above, while $\varphi$ can be asserted after the
specification $\star[\mathit{true}, \varphi]$, we cannot assert $\Diamond\varphi$ before it.

We believe that examples similar to that of Section 8 will be quite significant
in practice. For example, it is common to optimize protocols in distributed
systems by noting that in certain circumstances, one can omit sending a message
because its intended effect is already guaranteed to occur (perhaps because of
actions and events due to occur in the future). This is particularly useful when
developing protocols that optimize use of bandwidth by minimizing the number
of messages sent. Of course, our framework needs generalization to be applied
to such protocols; this is a topic of current work, and we have discussed
considerations applying to the generalization elsewhere [MM98]. The example in
Section 8 already suggests that mechanical assistance would be valuable for the
bookkeeping involved in maintaining and updating justification sets and
justification transformations. In addition, we believe that results derived using
temporal logic theorem provers and related tools can be incorporated in derivations
in our framework by providing input into rules such as Strengthen Spec and
Valid Assertion. There is clearly a wide range of topics and issues to explore in
developing this topic.

The definition of valid refinement we have introduced here demonstrates that 
it is possible to develop a refinement calculus involving temporal assertions that 
has an easily understood semantics and supports some quite intuitive reasoning. 
One point is worth making however: because it involves a quantification over the 
syntactic contexts $C(\cdot)$, our definition of valid refinement is likely to be sensitive
to the syntax of programs. It is therefore worthwhile to consider more semantic 
notions of valid refinement. In a followup paper we will define a semantic coun- 
terpart of valid refinement, and prove the soundness of the rules used in this 
paper, as well as a host of other rules, with respect to this semantic notion. 

We are also in the process of developing the framework to deal with several 
extensions of the language studied here. One extension introduces a way of pla- 
cing labels within a program and enriches the assertion language to allow it to 
express properties of the program locations corresponding to these labels. This 
allows some interesting properties such as termination to be expressed directly 
in the programming language (rather than indirectly, through a statement about 
refinement, as we have done here in the Regress Diamond rule). Use of labels also 
provides a tighter coupling between program structure and temporal assertions 
that enables some quite useful refinement rules. Finally, we are also in the process 
of extending the framework developed here to include quantification over “lo- 
cal predicates” [EvdMM98,EvdMM00]: this provides a framework generalizing 
knowledge-based [FHMV95] and knowledge-oriented programs [MK93]. 



References 



[BvW98] R. J. Back and J. von Wright. Refinement Calculus: A Systematic Approach.
Graduate Texts in Computer Science. Springer Verlag, 1998.

[EvdMM98] K. Engelhardt, R. van der Meyden, and Y. Moses. Knowledge and the
logic of local propositions. In I. Gilboa, editor, Proc. Conf. on Theoretical
Aspects of Reasoning about Knowledge, pages 29-41. Morgan Kaufmann,
July 1998.

[EvdMM00] K. Engelhardt, R. van der Meyden, and Y. Moses. A refinement
framework supporting reasoning about knowledge and time. In Proc. of
FOSSACS 2000. Springer Verlag, March 2000.

[FHMV95] R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about
Knowledge. MIT Press, Cambridge, Mass., 1995.

[GS86] S. Graf and J. Sifakis. A logic for the description of non-deterministic
programs and their properties. Information and Control, 68(1-3):254-270,
January/February/March 1986.

[Hay98] I. Hayes. Separating timing and calculation in real-time refinement. In
J. Grundy et al., editor, International Refinement Workshop & Formal
Methods Pacific, Proc. IRW/FMP'98, Series in Discrete Mathematics
and Theoretical Computer Science, 1998.

[HL95] K. Havelund and K. Larsen. A refinement logic for the fork calculus. In
S. T. Vuong and S. T. Chanson, editors, Protocol Specification, Testing
and Verification XIV, pages 5-20. Chapman and Hall, 1995. IFIP WG
6.1 Symposium.

[Hoa67] C.A.R. Hoare. An axiomatic basis for computer programming. Comm.
ACM, 12:576-580, 1967.

[Hol89] S. Holmström. A refinement calculus for specifications in Hennessy-Milner
logic with recursion. Formal Aspects of Computing, 1:242-272, 1989.

[HU97] I. Hayes and M. Utting. A sequential real-time refinement calculus.
Technical Report UQ-SVRG-97-33, Software Verification Research Centre,
University of Queensland, 1997. URL http://www.svrc.it.uq.edu.au/.

[Lam94] Leslie Lamport. The temporal logic of actions. ACM Transactions on
Programming Languages and Systems, 16(3):872-923, May 1994. Also
appeared as DEC SRC Research Report 79.

[MK93] Y. Moses and O. Kislev. Knowledge-oriented programming. In Proc. 12th
ACM Symp. on Principles of Distributed Computing, pages 261-270,
1993.

[MM98] R. van der Meyden and Y. Moses. Top-down considerations on
distributed systems. In Proc. 12th Int. Symp. on Distributed Computing,
DISC'98, pages 16-19, Andros, Greece, Sept 1998. Springer LNCS No.
1499.

[Mor87] J. M. Morris. A theoretical basis for refinement and the programming
calculus. Science of Computer Programming, 9(3):287-306, 1987.

[Mor90] C. Morgan. Programming from Specifications. Prentice Hall, New York,
1990.

[UF96] M. Utting and C. Fidge. A real-time refinement calculus that changes
only time. In He Jifeng, editor, Proc. 7th BCS/FACS Refinement
Workshop, Electronic Workshops in Computing. Springer, 1996.

[UF97] M. Utting and C. Fidge. Refinement of infeasible real-time programs. In
Proc. Formal Methods Pacific '97, Series in Discrete Mathematics and
Theoretical Computer Science, pages 243-262, 1997.

[Win86] G. Winskel. A complete proof system for SCCS with modal assertions.
Fundamenta Informaticae, IX:401-419, 1986.




Generalizing Action Systems to Hybrid Systems 



R.-J. Back, L. Petre and I. Porres 

Turku Centre for Computer Science (TUCS) 
Lemminkäisenkatu 14A, FIN-20520 Turku, Finland
{Ralph-Johan.Back, Luigia.Petre, Ivan.Porres}@abo.fi



Abstract. Action systems have been used successfully to describe di- 
screte systems, i.e., systems with discrete control acting upon a discrete 
state space. In this paper we extend the action system approach to hybrid 
systems by defining continuous action systems. These are systems with 
discrete control over a continuously evolving state, whose semantics is 
defined in terms of traditional action systems. We show that continuous 
action systems are very general and can be used to describe a diverse 
range of hybrid systems. Moreover, the properties of continuous action 
systems are proved using standard action systems proof techniques. 



1 Introduction 

A system using discrete control over continuously evolving processes is referred 
to as a hybrid system. The use of formal methods and models to describe hybrid 
systems has attracted considerable attention in recent years, with a number
of different models and formalisms being proposed in the literature (see e.g., [2, 
13,9]). We continue this line of research, essentially proposing what we believe is 
a new and very general model for hybrid systems, based on the action systems 
paradigm. 

Action systems [4] have been used successfully to model discrete systems, i.e., 
systems that use a discrete control upon a discrete state space. Their original 
purpose was to model concurrent and distributed systems. In this paper we 
show that the action system model can be adapted to model hybrid systems. 
An important advantage of this adaptation is that standard modeling and proof
techniques, developed for ordinary action systems, can be reused to model and 
reason about hybrid systems. 

Our extension of action systems to hybrid systems is based on a new approach 
to describing the state of a system. Essentially, our state variables will range 
over functions over time, rather than just over values. This allows a variable to 
capture not only its present value, but also the whole history of values that the 
variable has had, as well as the default future values that the variable will receive. 
Updating a state variable is restricted so that only the future behavior of the 
variable can be changed, not its past behavior. We will refer to action systems 
with this model of state as continuous action systems. Continuous action systems 
are inspired by, but differ from, the extension of action systems to hybrid systems 
described in [14]. 
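
The idea of a state variable that ranges over functions of time can be illustrated with a short sketch. The Python code below is an informal illustration under stated assumptions, not the definition given later in the paper: the class `TimedVar` and its methods `at` and `update` are hypothetical names. An update at time `now` replaces only the behaviour from `now` onwards, so the recorded past is preserved.

```python
class TimedVar:
    """A state variable modelled as a function from time to values (hypothetical API)."""

    def __init__(self, behaviour):
        self._behaviour = behaviour      # maps a time t >= 0 to a value

    def at(self, t):
        """Value of the variable at time t (past, present or default future)."""
        return self._behaviour(t)

    def update(self, now, future):
        """Replace the behaviour from time `now` onwards; the past is kept."""
        past = self._behaviour
        self._behaviour = lambda t: past(t) if t < now else future(t)


x = TimedVar(lambda t: 0.0)                 # default behaviour: constantly 0
x.update(2.0, lambda t: 3.0 * (t - 2.0))    # from t = 2 on, grow linearly
x.update(3.0, lambda t: 1.0)                # from t = 3 on, constantly 1

print(x.at(1.0), x.at(2.5), x.at(5.0))      # 0.0 1.5 1.0
```

Each update wraps the previous function rather than overwriting it, which is one simple way to guarantee that values before the update time never change.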

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 202-213, 2000. 

© Springer- Verlag Berlin Heidelberg 2000 







Proofs about action system properties are based on the refinement calculus [7].
This extends the programming logic based on weakest precondition predicate
transformers that was proposed in [10]. Action systems are intended to be
developed stepwise, the correctness of these refinement steps being verified within
the refinement calculus. Thereby, we get an implicit notion of refinement also
for continuous action systems. Even though the refinement of hybrid systems is
not the purpose of this paper, the approach we adopt for hybrid systems fits
well into the refinement calculus and it can be used for systems where correct
construction is a central concern.

The refinement calculus is based on higher-order logic, which in turn is an
extension of simply typed lambda calculus. Functions are defined by $\lambda$-abstraction
and can be used without explicit definition and naming. As an example, the
function that calculates the successor of a natural number is defined as $(\lambda n \cdot n + 1)$.
We denote by $f.x$ the application of the function $f$ to the argument $x$ so that,
e.g., $(\lambda n \cdot n + 1).1 = 2$. A binary relation $R \subseteq A \times B$ is here considered as a
function $R : A \to \mathcal{P}B$, i.e., mapping elements in $A$ to sets of elements in $B$.
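
As a small illustration of these conventions, the following Python sketch writes the successor function as an anonymous function and converts a relation given as a set of pairs into a set-valued function. The helper name `as_function` and the sample relation are hypothetical.

```python
succ = lambda n: n + 1          # the function (lambda n . n + 1)
print(succ(1))                  # 2


def as_function(pairs):
    """Turn a relation given as a set of (a, b) pairs into a map from a to a set of b."""
    def image(a):
        return {b for (x, b) in pairs if x == a}
    return image


R = {(1, "a"), (1, "b"), (2, "c")}   # hypothetical relation
R_fn = as_function(R)
print(R_fn(1) == {"a", "b"}, R_fn(3) == set())   # True True
```

An element with no related values is mapped to the empty set, which corresponds to the case where the relation is not defined at that element.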

We proceed as follows. The action system model is briefly reviewed in Sec- 
tion 2. We define the continuous action systems in Section 3. Their semantics 
is specified by explaining how to translate them into ordinary action systems. 
Section 4 contains examples of hybrid systems, modeled using our framework. In 
Section 5 we show how to prove safety properties for continuous action systems. 
Conclusions and comparisons to related work are presented in Section 6. 



2 Action Systems 

We start by giving a brief overview of the action systems formalism. An action 
system is essentially a discrete state space updated by a discrete control mecha- 
nism. The state of the system is described using attributes or program variables. 
We define a finite set Attr of attribute names and assume that each attribute 
name in Attr is associated with a non-empty set of values. This set of values is 
the type of the attribute. If the attribute $x$ takes values from Val, we say that
$x$ has the type Val and we write it as $x : Val$. We consider several predefined
types, like Real for the set of real numbers, Real+ for the set of non-negative real 
numbers, and Bool for the boolean values {F,T}. 

An action system consists of a finite set of attributes, used to observe and 
manipulate the state of the system, and a finite set of actions that act upon the 
attributes. This set of actions models the control mechanism over the state of 
the system. An action system A has the following form: 

$\mathcal{A} = |[\ \mathbf{var}\ x : Val \bullet S_0 ; \mathbf{do}\ A_1 \mathbin{\square} \ldots \mathbin{\square} A_m\ \mathbf{od}\ ]| : y$

Here $x : Val = x_1 : Val_1, \ldots, x_n : Val_n$ are the local attributes of the system, $S_0$
is a statement that initializes the attributes, while $A_i = g_i \to S_i$, $i = 1, \ldots, m$,
are the actions of the system. The boolean expression $g_i$ is the guard of the
action $A_i$ and $S_i$ is the body of the action. The attributes $y = y_1, \ldots, y_k$ are
defined in the environment of the action system and called imported attributes.







Attributes in $x$ may be exported, in the sense that they can be read, or written,
or both read and written by environment actions. In this case, we decorate these
attributes with $-$, $+$ or $*$, respectively. An action $A$ of the form ‘$g \to S$’ is a
guarded statement that can be executed only when $g$ is enabled, i.e., when $g$
evaluates to T. The body $S$ of an action is defined by the following syntax:

$S ::= \mathsf{abort} \mid \mathsf{skip} \mid x := e \mid [x := x' \mid R] \mid \mathbf{if}\ g\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi} \mid S_1 ; S_2$

Here $x$ is a list of attributes, $e$ is a corresponding list of expressions, $x'$ is a list of
variables standing for unknown values, and $R$ is a relation specified in terms of $x$
and $x'$. Intuitively, ‘skip’ is the stuttering action, ‘$x := e$’ is a multiple assignment,
‘if $g$ then $S_1$ else $S_2$ fi’ is the conditional composition of two statements, and
‘$S_1 ; S_2$’ is the sequential composition of two statements. The action ‘abort’ always
fails and is used to model disallowed behaviors. Given a relation $R(x, x')$ and a
list of attributes $x$, we denote by $[x := x' \mid R]$ the non-deterministic assignment
of some value $x' \in R.x$ to $x$ (the effect is the same as abort, if $R.x = \emptyset$).
The semantics of the actions language has been defined in terms of weakest 
preconditions in a standard way [10]. Thus, for any predicate q, we define 

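\begin{align*}
wp(\mathsf{abort}, q) &= F \\
wp(\mathsf{skip}, q) &= q \\
wp(x := e, q) &= q[x := e] \\
wp([x := x' \mid R], q) &= (\forall x' \in R.x \bullet q[x := x']) \\
wp(S_1 \,;\, S_2, q) &= wp(S_1, wp(S_2, q)) \\
wp(\mathsf{if}\ g\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2\ \mathsf{fi}, q) &= \mathsf{if}\ g\ \mathsf{then}\ wp(S_1, q)\ \mathsf{else}\ wp(S_2, q)\ \mathsf{fi}
\end{align*}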

The term q[x := e] stands for the result of substituting e for all free occurrences 
of variable x in predicate q. 
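
The rules above can be made concrete with a small executable sketch. The Python fragment below is an illustration only, not part of the formalism: it assumes that a state is a dictionary from attribute names to values, that a predicate is a Boolean function on states, and that a statement is represented by its predicate transformer. The helper names (`abort`, `skip`, `assign`, `choose`, `seq`, `cond`) are hypothetical, and `choose` assumes that the relation yields a finite set of candidate values.

```python
# A statement is modelled as a function from a postcondition q
# to the weakest precondition wp(S, q). All names are illustrative.

def abort():
    return lambda q: (lambda s: False)            # wp(abort, q) = F

def skip():
    return lambda q: q                            # wp(skip, q) = q

def assign(x, e):
    # wp(x := e, q) = q[x := e]: evaluate q in the updated state
    return lambda q: (lambda s: q({**s, x: e(s)}))

def choose(x, R):
    # wp([x := x' | R], q) = (forall x' in R.x . q[x := x'])
    # R maps a state to the finite set of allowed new values of x
    return lambda q: (lambda s: all(q({**s, x: v}) for v in R(s)))

def seq(S1, S2):
    return lambda q: S1(S2(q))                    # wp(S1 ; S2, q)

def cond(g, S1, S2):
    return lambda q: (lambda s: S1(q)(s) if g(s) else S2(q)(s))

# x := x + 1 ; [y := y' | y' in {x, x + 1}]
S = seq(assign("x", lambda s: s["x"] + 1),
        choose("y", lambda s: {s["x"], s["x"] + 1}))
q = lambda s: s["y"] >= 1

print(S(q)({"x": 0, "y": 0}))    # True: x becomes 1, y is 1 or 2
print(S(q)({"x": -1, "y": 0}))   # False: x becomes 0, y may be 0
```

Representing statements as predicate transformers keeps each clause of the sketch in one-to-one correspondence with a wp equation.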

The execution of an action system is as follows. The initialization $S_0$ will
set the attributes x to some specific values, using a sequence of possibly
non-deterministic assignments. Then, enabled actions are repeatedly chosen and
executed. The chosen actions will change the values of the attributes in a way that
is determined by the action body. Two or more actions can be enabled at the
same time, in which case one of them is chosen for execution, in a demonically
non-deterministic way. The computation terminates when no action is enabled.
Action systems model parallel execution by interleaving atomic actions in a
demonically non-deterministic fashion.
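
The execution model just described can be sketched as a simple loop. The Python fragment below is a hedged illustration: the function `run_action_system`, its parameters and the example system are assumptions made here, and a pseudo-random choice only samples one possible resolution of the demonic choice, so a run of this loop is a test of one behavior and not an analysis of all of them.

```python
import random

def run_action_system(state, init, actions, max_steps=1000, rng=None):
    """Run `init`, then repeatedly execute some enabled action.

    actions: list of (guard, body) pairs, with guard: state -> bool
    and body: state -> state. Returns the final state on termination.
    """
    rng = rng or random.Random(0)
    state = init(dict(state))
    for _ in range(max_steps):
        enabled = [body for guard, body in actions if guard(state)]
        if not enabled:
            return state                 # no action enabled: terminate
        state = rng.choice(enabled)(state)
    raise RuntimeError("step bound reached; the system may not terminate")

# Hypothetical example: two actions that compute a greatest common divisor.
init = lambda s: {**s, "x": 12, "y": 18}
actions = [
    (lambda s: s["x"] > s["y"], lambda s: {**s, "x": s["x"] - s["y"]}),
    (lambda s: s["y"] > s["x"], lambda s: {**s, "y": s["y"] - s["x"]}),
]
final = run_action_system({}, init, actions)
print(final)   # {'x': 6, 'y': 6}
```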

In the following, we specify a notion of time and show how to model attributes 
that are functions of time. These extensions to the action systems formalism 
define a new model for hybrid systems. 

3 Continuous Action Systems 

A system using a discrete control mechanism over a continuously evolving state 
is referred to as a hybrid system. In this section we introduce continuous action 
systems, an extension of the action system formalism to model hybrid systems. 

A continuous action system consists of a finite set of time-dependent
attributes together with a finite set of actions that act upon them. The attributes
can range over discrete or continuous domains and form the state of the system.
A continuous action system is of the form:

\[ C = |(\ \mathbf{var}\ x : Real_+ \to Val \bullet S_0 \,;\ \mathbf{do}\ g_1 \to S_1 \;[]\; \ldots \;[]\; g_m \to S_m\ \mathbf{od}\ )| : y \tag{1} \]

Intuitively, executing a continuous action system proceeds as follows. There
is an implicit variable now that shows the present time. Initially now = 0.
The initialization $S_0$ assigns initial time functions to the attributes $x_1, \ldots, x_n$.
These time functions describe the default future behavior of the attributes, whose
values may, thereby, change with the progress of time. The system will then start
evolving according to these functions, with time (as measured by now) moving
forward continuously. The guards of the actions may refer to the value of now, as
may expressions in the action bodies and the initialization statements.

As soon as one of the conditions $g_1, \ldots, g_m$ becomes true, the system chooses
one of the enabled actions, say $g_i \to S_i$, for execution. The choice is
non-deterministic if there is more than one such action. The body $S_i$ of the action
is then executed. Execution is atomic and instantaneous. It will usually change
some attributes by changing their future behavior. We write $x :- e$ for an
assignment rather than $x := e$, to emphasize that only the future behavior of the
attribute x is changed to the function e and the past behavior remains unchanged.
Attributes that are not changed will behave as before. After the changes
stipulated by $S_i$ have been done, the system will evolve to the next time instance
when one of the actions is enabled, and the process is repeated. The next time
instance when an action is enabled may well be the same as the previous, i.e.,
time does not need to progress between the execution of two enabled actions.
This is usually the case when the system is doing some (discrete, logical)
computation to determine how to proceed next. Such computation does not take any
time. It is possible that after a certain time instance, none of the actions will
be enabled anymore. This just means that the system will continue to evolve
forever according to the functions last assigned to the attributes.

As an example of a continuous action system consider the system in Fig. 1.
The attributes x and clock are first initialized to the constant function $(\lambda t \bullet 0)$
and the switching function up is set to the constant function $(\lambda t \bullet F)$. The guard
of the first action is immediately enabled at time 0, so the first action’s body
is executed immediately. The future behaviors of clock and x are changed to
increase linearly from 0, and the future behavior of up is changed to the constant
function $(\lambda t \bullet T)$, i.e., up is set to be T in all the future time instances. After
this, the system starts to evolve by advancing time continuously. In particular,
the value of x increases linearly, depending on time. When x gets value 1, the
second action is enabled. The clock is then first reset, the future behavior of
x is changed to decrease linearly with the clock value, and the future value of
up is set to the constant F. This continues until x reaches 0, when the first
action is again enabled, changing x to increase again, and so on. The effect of
these two actions is a sawtooth-like behavior, where the value of x alternately
increases and decreases forever. The evolution of the system is also described in
Fig. 1, showing each attribute on the same time domain together with the points
in time where a discrete action is performed. We see that a continuous action
system is just a non-deterministic way of defining a collection of time dependent
functions. One of the main advantages of this model for hybrid computation is
that both discrete and continuous behavior can be described in the same way.
In particular, if the attributes are assigned only constant functions, we obtain a
discrete computation.

\[
\begin{array}{l}
Saw = |(\ \mathbf{var}\ x, clock : Real_+ \to Real \,;\ up : Real_+ \to Bool \\
\quad \bullet\ x :- (\lambda t \bullet 0) \,;\ clock :- (\lambda t \bullet 0) \,;\ up :- (\lambda t \bullet F) \,; \\
\quad \mathbf{do}\ x.now = 0 \wedge \neg up.now \to \\
\qquad\quad clock :- (\lambda t \bullet t - now) \,; \\
\qquad\quad x :- clock \,;\ up :- (\lambda t \bullet T) \\
\quad [] \ \ x.now = 1 \wedge up.now \to \\
\qquad\quad clock :- (\lambda t \bullet t - now) \,; \\
\qquad\quad x :- (\lambda t \bullet 1 - clock.t) \,; \\
\qquad\quad up :- (\lambda t \bullet F) \\
\quad \mathbf{od} \\
)|
\end{array}
\]

Fig. 1. Continuous action system Saw (left) and its behavior (right).
[Plot not reproduced: x, clock and up over time, with the points where the 1st and 2nd actions are executed.]

Semantics of continuous action systems. Let C be the continuous action
system in (1). We explain the meaning of C by translating it into an ordinary
action system. Its semantical interpretation is given by the following (discrete)
action system $\bar{C}$:

\[
\begin{array}{l}
\bar{C} \mathrel{\hat{=}} |[\ \mathbf{var}\ now : Real_+,\ x : Real_+ \to Val \\
\quad \bullet\ now := 0 \,;\ S_0 \,;\ N \,; \\
\quad \mathbf{do}\ g_1 \to S_1 \,;\ N \;[]\; \ldots \;[]\; g_m \to S_m \,;\ N\ \mathbf{od} \\
]| : y
\end{array}
\]

Here the attribute now is declared, initialized, and updated explicitly. It models
the time moments that are of interest for the system, i.e., the starting time
and the succeeding moments when some action is enabled. The value of now is
updated by the statement $N \mathrel{\hat{=}} now := next.gg.now$. Here $gg = g_1 \vee \ldots \vee g_m$
is the disjunction of all guards of the actions and next is defined by

\[
next.gg.t = \begin{cases}
\min\{t' \geq t \mid gg.t'\}, & \text{if } \exists\, t' \geq t \text{ such that } gg.t', \\
t, & \text{otherwise.}
\end{cases} \tag{3}
\]



The function next models the moments of time when at least one action is enabled.
Only at these moments can the future behavior of attributes be modified. If
no action will ever be enabled, then the second branch of the definition will be
followed, and the attribute now will denote the moment of time when the last
discrete action was executed. In this case the discrete control terminates and
the attributes will evolve forever according to the functions last assigned. We
assume in this paper that the minimum in the definition of next always exists
when at least one guard is enabled in the present or future. Continuous action
systems that do not satisfy this requirement are considered ill-defined.

The future update $x :- e$ is defined by $x :- e \mathrel{\hat{=}} x := x/now/e$, where
$x/t_0/e = (\lambda t \bullet \mathsf{if}\ t < t_0\ \mathsf{then}\ x.t\ \mathsf{else}\ e.t\ \mathsf{fi})$. Thus, only the future behavior of x
is changed by the future update. It is important to note that all the attributes of
a continuous action system are functions of time, except for now. As an example,
the statement $x :- (\lambda t \bullet t)$ updates the default future of x with an increasing
function, while $x :- (\lambda t \bullet now)$ updates it with a constant function. We write
$x :- c$ as a shorthand for $x :- (\lambda t \bullet c)$ when c is a constant.
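
The definition of the future update translates almost literally into code. The following Python sketch assumes that time functions are ordinary Python callables; the name `future_update` is a hypothetical helper introduced here for illustration.

```python
def future_update(x, now, e):
    """Return x/now/e: agrees with x before `now` and with e from `now` on."""
    return lambda t: x(t) if t < now else e(t)

x = lambda t: 0          # current behavior of the attribute
now = 2

x_inc = future_update(x, now, lambda t: t)      # x :- (λt • t)
x_const = future_update(x, now, lambda t: now)  # x :- (λt • now)

print(x_inc(1), x_inc(2), x_inc(3))   # 0 2 3: the past is unchanged
print(x_const(1), x_const(5))         # 0 2: constant from `now` on
```

The two calls mirror the two example statements in the text: the first installs an increasing future, the second a constant one, and neither alters the values before `now`.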

This explication of a continuous action system shows it essentially as a collection
of time functions $x_1, \ldots, x_n$ over the non-negative reals, defined in a stepwise
manner. The steps form a sequence of intervals $I_0, I_1, I_2, \ldots$, where each
interval $I_k$ is either a left-closed interval of the form $[t_i \ldots t_{i+1})$ or a closed
interval of the form $[t_i, t_i]$, i.e., a point. The action system determines a family
of functions $x_1, \ldots, x_n$ which are stepwise defined over this sequence of intervals
and points. The extremes of these intervals correspond to the control points
of the system where a discrete action is performed. In the Saw example, the
sequence of intervals is $[0], [0 \ldots 1), [1 \ldots 2), [2 \ldots 3), \ldots$ As such, the continuous
action system can be best understood as the limit of a sequence of approximations
of the time functions $x_1, \ldots, x_n$, defined over successively longer and longer
intervals $[0 \ldots t_i)$, where $i = 0, 1, 2, \ldots$ Looking at the example in this way, its
sequence of initial segments is $[0], [0 \ldots 1), [0 \ldots 2), [0 \ldots 3), \ldots$ and the defined
approximations are successively:

\[
x_0.t = 0,\ 0 \le t; \qquad
x_1.t = t,\ 0 \le t; \qquad
x_2.t = \begin{cases} t, & 0 \le t < 1 \\ 2 - t, & 1 \le t \end{cases} \qquad
x_3.t = \begin{cases} t, & 0 \le t < 1 \\ 2 - t, & 1 \le t < 2 \\ t - 2, & 2 \le t \end{cases}
\]
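
The stepwise construction of the Saw behavior can be reproduced with a small simulation of the discrete translation. The sketch below is an approximation under stated assumptions: `next` is computed by scanning an exact rational grid of width `STEP` up to a hypothetical `horizon`, which only works because the Saw guards first become true on grid points. The function names are illustrative.

```python
from fractions import Fraction

STEP = Fraction(1, 4)   # assumption: guards first become true on this grid

def future_update(x, now, e):
    return lambda t: x(t) if t < now else e(t)

def next_time(gg, attrs, t, horizon):
    """Grid approximation of next.gg.t = min{t' >= t | gg.t'} (t if none)."""
    u = t
    while u <= horizon:
        if gg(attrs, u):
            return u
        u += STEP
    return t

def rise(a, now):   # body of the first Saw action
    clock = future_update(a["clock"], now, lambda t: t - now)
    return {"clock": clock,
            "x": future_update(a["x"], now, clock),
            "up": future_update(a["up"], now, lambda t: True)}

def fall(a, now):   # body of the second Saw action
    clock = future_update(a["clock"], now, lambda t: t - now)
    return {"clock": clock,
            "x": future_update(a["x"], now, lambda t: 1 - clock(t)),
            "up": future_update(a["up"], now, lambda t: False)}

ACTIONS = [
    (lambda a, t: a["x"](t) == 0 and not a["up"](t), rise),
    (lambda a, t: a["x"](t) == 1 and a["up"](t), fall),
]

def saw(horizon):
    zero = lambda t: Fraction(0)
    attrs = {"x": zero, "clock": zero, "up": lambda t: False}
    gg = lambda a, t: any(g(a, t) for g, _ in ACTIONS)
    now = next_time(gg, attrs, Fraction(0), horizon)    # now := 0 ; S0 ; N
    control_points = []
    while gg(attrs, now):
        body = next(b for g, b in ACTIONS if g(attrs, now))
        attrs = body(attrs, now)                        # S_i
        control_points.append(now)
        now = next_time(gg, attrs, now, horizon)        # N
    return attrs, control_points

attrs, points = saw(Fraction(3))
print([int(p) for p in points])                         # [0, 1, 2, 3]
print([attrs["x"](Fraction(k, 2)) for k in (1, 3, 5)])  # each value is 1/2
```

With a horizon of 3 the run performs actions at times 0, 1, 2 and 3, and on the first three segments x follows t, 2 - t and t - 2, in line with the approximation $x_3$.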



For each attribute $x_i$ there is a defined history of its past, i.e. the interval
[0, now), its present value in the point [now], and a default future. The execution
of an action can modify the present value of an attribute and its default future,
but not its past. It is important to note that such a definition does not necessarily
determine a single function for $x_i$. Because of the non-deterministic choices
involved, there might be a collection of such function tuples that are allowed 
by the continuous action system, and we cannot know which one of these will 
actually be the one the system follows. Thus, the system behavior may only be 
determined up to a certain tolerance, and any system behavior that is within 
these limits is possible. 

Another important observation regards the possibility of Zeno behavior. That
is, our definition does not guarantee that the sequence of generated intervals
will cover all the non-negative reals. They might only cover an initial segment of
these. In this case, there is a limit point of time that the action system reaches
when the number of iterations reaches infinity. These systems are well-defined
but the simple explication of the behavior of the hybrid system is then not
sufficient. For this, we further assume that the system is restarted at the limit
point, and repeat the process again. This is meaningful if all the attribute values
converge to a well-defined value in the limit. This restart can be carried out
as many times as needed. Thus, a continuous action system may have multiple
limit points in its execution. However, the standard action system semantics does
not allow multiple limit points, so this is a point where the semantics has to be
extended. For simplicity, we assume in the sequel that there is no Zeno behavior
and a single limit point is sufficient. The absence of Zeno behavior means that
the action system will define the values of the attributes for the whole domain
of Real+.

A simple way of reaching a limit point is when a control computation (where
the time does not advance) does not terminate. This means that the continuous
behavior of the system is stuck at the last time instance reached.
Non-termination of the control computation is most certainly undesired and
unintended. It is therefore desirable to prove that control computations where
time does not advance always terminate.

Composing continuous action systems. In order to model complex hybrid
systems, where several different subsystems or components evolve concurrently,
we need to formally define the composition of continuous action systems. Two
action systems communicate by means of imported and exported variables.
We can also model other means of communication using the action systems 
framework [6], but this is out of the scope of this paper. For parallel composition, 
we may also need to rename certain attributes of the system when describing 
more complex systems, but we ignore this aspect here for brevity. 

We define the parallel composition of two continuous action systems by using
essentially the parallel composition operator for ordinary action systems [5]. Thus, if
we have two continuous action systems C and C' as in (1), then their parallel
composition is the continuous action system C || C' defined as follows:

\[
\begin{array}{l}
C \parallel C' = |(\ \mathbf{var}\ x : Real_+ \to Val,\ x' : Real_+ \to Val' \\
\quad \bullet\ S_0 \,;\ S'_0 \,; \\
\quad \mathbf{do}\ g_1 \to S_1 \;[]\; \ldots \;[]\; g_m \to S_m \;[]\; g'_1 \to S'_1 \;[]\; \ldots \;[]\; g'_{m'} \to S'_{m'}\ \mathbf{od} \\
)| : (y \cup y') - (x \cup x')
\end{array} \tag{4}
\]

where the unprimed entities originally belonged to C and the primed entities
to C'. We assume here that the variables x and x' are disjoint. We need to
combine the continuous action systems before we translate them into discrete
action systems, because the local variable now appears in both C and C'. By
combining the continuous action systems first, we ensure that C || C' uses a
single now variable, which is checked by actions from both components.

Thus, parallel composition essentially combines the attributes of the two
component systems and, therefore, their continuous evolution. Because the
actions in the parallel composition are the combined actions of the two systems,
discrete changes will usually occur more frequently. An action in one component
system may depend on an attribute in the other component system, which may
be again modified by actions of the former system. This means that the behavior
of a system in a parallel composition is usually different from the behavior of
the system when it is alone.
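
As a rough illustration of the bookkeeping in (4), the Python sketch below merges two component descriptions. It is a sketch under assumptions: the class `ContinuousActionSystem`, the function `parallel` and the heater example are hypothetical, actions are kept as opaque placeholders, and the imported attributes of the composition are read as those imported by either component minus those that are local to either.

```python
from dataclasses import dataclass, field

@dataclass
class ContinuousActionSystem:
    local: dict                 # local attribute name -> initial time function
    actions: list               # guarded actions, kept abstract here
    imported: set = field(default_factory=set)

def parallel(c1, c2):
    """Compose two systems: merge attributes and actions, one shared clock."""
    shared = set(c1.local) & set(c2.local)
    if shared:
        raise ValueError(f"local attributes must be disjoint: {sorted(shared)}")
    local = {**c1.local, **c2.local}
    # Assumption: imports of the composition exclude everything now local.
    imported = (c1.imported | c2.imported) - set(local)
    return ContinuousActionSystem(local, c1.actions + c2.actions, imported)

# Hypothetical components: a heater reads `on`, a controller reads `temp`.
heater = ContinuousActionSystem(
    local={"temp": lambda t: 20.0},
    actions=["start_heating", "start_cooling"],
    imported={"on"},
)
controller = ContinuousActionSystem(
    local={"on": lambda t: False},
    actions=["toggle"],
    imported={"temp"},
)
both = parallel(heater, controller)
print(sorted(both.local), len(both.actions), both.imported)
```

The composed system has 2 + 1 = 3 actions and no remaining imports, since each component supplies the attribute that the other one reads.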

4 Modeling Systems 

In this section we illustrate how a hybrid system can be described as a conti- 
nuous action system. We show how to model real-time systems, systems using 
differential equations, and also a press that reacts to external signals from the 
environment. 

We can use clock variables to measure the passage of time and to correlate
the execution of an action with the time. A clock variable is an attribute that
measures the time elapsed since it was set to zero. Assume that c is an attribute
of type Real. We then use the following definition for resetting the clock c:

\[ reset(c) \mathrel{\hat{=}} c :- (\lambda t \bullet t - now) \]

This definition is just a convenience for correlating the behavior of a system
with the passage of the time. Since a clock variable is a regular attribute, we
can define as many clocks as needed and reset them independently. It is also
possible to do arithmetic operations with clock variables, to use time constraints
as guards, or to refer to past values of an attribute, e.g. $x.(now - 1)$. Hence,
continuous action systems can be used to model real-time systems.

The behavior of a dynamic system is often described using a system of
differential equations. We can allow this kind of definition by introducing the
shorthand

\[ \dot{x} :- f \mathrel{\hat{=}} [\, x :- y \mid y.now = x.now \wedge \dot{y} = f.y,\ t \geq now \,] \]

This will assign to x a time function that satisfies the given differential equation
and which is such that the function x is continuous at now. As an example,
if $f = (\lambda t \bullet c)$, where c is a constant value, then we have that $\dot{x} :- (\lambda t \bullet c)$ =
$x :- (\lambda t \bullet x.now + c * (t - now))$. Thus, we can use continuous action systems
to express hybrid systems using either explicit functional expressions or implicit
differential equations.
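
The constant-derivative example can be checked by a short calculation. The derivation below only covers the case $f = (\lambda t \bullet c)$ used in the text; the integration variable $s$ is introduced here for illustration.

\begin{align*}
\dot{y}.t &= c \quad (t \geq now), \qquad y.now = x.now \\
y.t &= y.now + \int_{now}^{t} c \, ds \\
    &= x.now + c\,(t - now)
\end{align*}

The resulting function agrees with x at now, so the updated attribute is continuous there, and its slope is c from now on.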

An example of a press from a metal processing factory [12] is shown in Fig. 2.
The press works as follows. First, its lower part is raised until the middle position.
Then an upper conveyor belt feeds a metal blank into the press. When the press
is loaded (signalled by $sensor_1$ being T), the lower part of the press is raised
until the top position and the blank is forged. The press will then move down
until the bottom position and the forged blank is placed into a lower conveyor
belt. When the press is unloaded (signalled by $sensor_2$ being T), its lower part
is raised to the middle position, ready for being loaded again.

The press works cyclically and keeps evolving from one phase to another. We
model these phases with a task attribute in the continuous action system Press
shown in Fig. 2. This attribute can have the discrete values loading, pressing,
moving2unload, unloading, moving2load. The continuous attribute p shows the
position of the press plate and is, at different moments in time, a linearly
increasing, a linearly decreasing or a constant function of time. The positions of
reference for the press, i.e. bottom, middle, and top, are given as parameters.

\[
\begin{array}{l}
Press = |(\ \mathbf{var}\ p, c : Real_+ \to Real \,; \\
\qquad task : Real_+ \to \{loading, pressing, unloading, moving2unload, moving2load\} \\
\quad \bullet\ reset(c) \,;\ p :- middle \,;\ task :- loading \,; \\
\quad \mathbf{do}\ task.now = loading \wedge sensor_1.now \to \\
\qquad\quad reset(c) \,;\ p :- (\lambda t \bullet middle + v * c.t) \,;\ task :- pressing \\
\quad [] \ \ task.now = pressing \wedge p.now = top \to \\
\qquad\quad reset(c) \,;\ p :- (\lambda t \bullet top - v * c.t) \,;\ task :- moving2unload \\
\quad [] \ \ task.now = moving2unload \wedge p.now = bottom \to \\
\qquad\quad p :- bottom \,;\ task :- unloading \\
\quad [] \ \ task.now = unloading \wedge \neg sensor_2.now \to \\
\qquad\quad reset(c) \,;\ p :- (\lambda t \bullet bottom + v * c.t) \,;\ task :- moving2load \\
\quad [] \ \ task.now = moving2load \wedge p.now = middle \to \\
\qquad\quad p :- middle \,;\ task :- loading \\
\quad \mathbf{od} \\
)| : sensor_1, sensor_2
\end{array}
\]

Fig. 2. Press functioning as a continuous action system.

The press example is a typical part of a control system. Systems of this kind
are essentially composed of several components that work together in order
to meet the requirements of the overall system. Thus, an important feature of a
component is its interaction with the environment. In the case of the press the
interaction with the environment (two conveyor belts) is modeled with several
sensors. The sensors are modeled as imported attributes that can be changed
by the environment at any time. The press reads the values that $sensor_1$ and
$sensor_2$ display, but these values are updated by the environment in a way we
are not interested in here.

Other types of hybrid systems can be modelled as well using continuous 
action systems. Some more examples can be found in [8,14]. 



5 Safety Properties 

Properties of continuous action systems can be established by proving that these
properties hold for the corresponding discrete action systems. Hence, there is no
special proof theory for continuous action systems, but the standard proof theory
for action systems suffices (with the exception that we may need to consider
multiple limit points, as was mentioned earlier). In this paper, we concentrate
on safety properties, as in many cases they are the kind of properties that we
want to initially establish for hybrid systems.

A common characterization for a safety property is that nothing ‘bad’ happens
during the lifetime of the system. Put in another way, a safety property
is a ‘good’ property G that always holds, i.e., $(\forall t \geq 0 \bullet G.t)$. We can establish
this property for the action system C in (1) by proving that a property
$I = (\forall t \mid 0 \leq t \leq now \bullet G'.t)$ is an invariant of the corresponding discrete
action system $\bar{C}$, where $(\forall t \geq 0 \bullet G'.t \Rightarrow G.t)$. This implies the safety property,
provided that the system does not have a Zeno behavior and does not terminate
(i.e., now will go to infinity in the system). More precisely, the safety property
G holds when the system is started in an initial state satisfying P, if and only
if the following three conditions are satisfied for $\bar{C}$:

\begin{align*}
& \forall t \geq 0 \bullet G'.t \Rightarrow G.t \\
& P \Rightarrow wp(now := 0 \,;\ S_0 \,;\ N,\ I) \\
& I \wedge g_i \Rightarrow wp(S_i \,;\ N,\ I), \quad i = 1, \ldots, m
\end{align*}
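
The shape of the last two conditions, where the invariant is checked after the initialization followed by N and after each action body followed by N, can be illustrated by a bounded runtime check on the Saw system of Fig. 1. The Python sketch below is a test on sampled points, not a proof: the grid width `STEP`, the `horizon` argument and all function names are assumptions made for this illustration, and evolution beyond the last control point is not examined.

```python
from fractions import Fraction

STEP = Fraction(1, 4)   # sampling grid; an assumption of this sketch

def upd(x, now, e):
    """Future update x/now/e."""
    return lambda t: x(t) if t < now else e(t)

def rise(a, now):
    clock = upd(a["clock"], now, lambda t: t - now)
    return {"clock": clock, "x": upd(a["x"], now, clock),
            "up": upd(a["up"], now, lambda t: True)}

def fall(a, now):
    clock = upd(a["clock"], now, lambda t: t - now)
    return {"clock": clock, "x": upd(a["x"], now, lambda t: 1 - clock(t)),
            "up": upd(a["up"], now, lambda t: False)}

ACTIONS = [(lambda a, t: a["x"](t) == 0 and not a["up"](t), rise),
           (lambda a, t: a["x"](t) == 1 and a["up"](t), fall)]

def advance(a, t, horizon):
    """Statement N on the grid: first t' >= t with an enabled guard, else t."""
    u = t
    while u <= horizon:
        if any(g(a, u) for g, _ in ACTIONS):
            return u
        u += STEP
    return t

def holds_up_to(G, a, now):
    """Sampled version of I = (forall t | 0 <= t <= now . G'.t)."""
    return all(G(a, k * STEP) for k in range(int(now / STEP) + 1))

def check_safety(G, horizon):
    zero = lambda t: Fraction(0)
    a = {"x": zero, "clock": zero, "up": lambda t: False}
    now = advance(a, Fraction(0), horizon)          # now := 0 ; S0 ; N
    if not holds_up_to(G, a, now):
        return False
    while any(g(a, now) for g, _ in ACTIONS):
        body = next(b for g, b in ACTIONS if g(a, now))
        a = body(a, now)                            # S_i
        now = advance(a, now, horizon)              # ; N
        if not holds_up_to(G, a, now):
            return False
    return True

print(check_safety(lambda a, t: 0 <= a["x"](t) <= 1, Fraction(3)))  # True
print(check_safety(lambda a, t: a["x"](t) < 1, Fraction(3)))        # False
```

Because N moves now to the next control point before the invariant is tested, each check covers the continuous evolution between two discrete actions, which is the reason N appears inside the wp expressions.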

Consider the press example in Fig. 2. We consider two safety properties. First, we
want to prove that the movable plate of the press does not pass the limits of the
machine. Formally this is expressed by $(\forall t \geq 0 \bullet bottom \leq p.t \leq top)$, where p is
the vertical position of the plate. Second, we want to prove that p is a continuous
function on Real+. We need to choose an invariant I that allows us to establish
the safety property $(\forall t \geq 0 \bullet bottom \leq p.t \leq top) \wedge (p \text{ continuous on } Real_+)$ using
the proof rule above.

For the first conjunct of the safety property, an invariant of the form
$(\forall t \mid 0 \leq t \leq now \bullet bottom \leq p.t \leq top)$ would be sufficient. However, to prove the global
continuity property, we need a stronger invariant, which also ensures that the
press remains in the correct position during the loading and unloading
operations. The following invariant I is sufficient for establishing the required safety
property:

\[
\begin{array}{ll}
I = ( & p \text{ continuous on } [0, now] \ \wedge\ (\forall t \mid 0 \leq t \leq now \bullet bottom \leq p.t \leq top) \ \wedge \\
 & (\forall t \mid 0 \leq t \leq now \bullet task.t = loading \Rightarrow p.t = middle) \ \wedge \\
 & (\forall t \mid 0 \leq t \leq now \bullet task.t = unloading \Rightarrow p.t = bottom)\ )
\end{array}
\]



The proof must establish that the invariant is satisfied by the initialization from 
the moment 0 until the first moment an action is enabled and during the time 
elapsed between the execution of two actions. The discharging of the proof
obligations can be found in [8].

6 Conclusions and Related Work 



In this paper we have shown how to generalize the action systems framework
for modeling hybrid systems, by introducing the notion of continuous action
systems. We model attributes in continuous action systems as functions over
time that are updated in a way that only changes their present and future
behavior. Essentially, this amounts to extending the notion of state with both
a history and a default future, thus generalizing the classical action systems
approach that only handles the present state.

This extension allows us to model systems that combine discrete control 
with continuous behavior, the latter either defined by explicit functions of time 
or by differential equations. We have also shown that the continuous action sy- 
stems model provides a simple way of defining the parallel composition of hybrid 
systems, using communication by means of imported and exported attributes. 
Finally, we explained how to prove safety properties of continuous action systems 
using the classical invariant method. We illustrated these concepts with a simple 
example, while a complete case study can be found in [3]. 

The idea of extending an existing formalism to model real-time systems by 
introducing a variable representing the time was presented by Abadi and Lam- 
port in [1]. We follow the same approach here, extending an existing formalism 
to handle hybrid systems instead of creating a new formalism specific for such 
systems. This provides a clear advantage, as we can reuse all the previous results 
on action systems to study real-time and hybrid systems models. 

Rönkkö and Ravn [14] have already proposed a model for combining action
systems and continuous behavior, called hybrid action systems. In their model,
the continuous evolution of a variable is modeled as a special kind of atomic
action. An atomic action cannot be interrupted and its bounds are specified in
advance. This affects the parallel composition of systems, since different
simultaneous actions must be combined into a sequence of atomic actions. In the worst
case, the parallel composition of two systems with n and m actions leads to
a system with n * m actions. Also, there is no implicit notion of time in their
approach, which is not intended for modeling real-time systems. In our model,
parallel composition of two such systems gives a continuous action system with
n + m actions. This is a major simplification for handling large systems.

These advantages still exist when comparing our formalism with hybrid
automata [2]. The number of states in the parallel composition of two hybrid
automata is also the product of the number of states of the original automata. 
Note that in the hybrid automata formalism, transitions are fired synchronously, 
while in the action system formalism actions are selected and executed asynchro- 
nously. The continuous action system formalism is more expressive than hybrid 
automata, as it allows references to historical values of the attributes in guards 
and expressions. Compared to hybrid automata, our model also allows the at- 
tributes to be selectively updated: only those attributes that are changed need 
to be mentioned in an action. 

Another interesting model for hybrid systems is provided by phase transition 
systems [11]. In this model, the continuous behavior of the system is modeled 
using a finite set of activities. However, only one activity can be enabled at a 
certain time. Thus, a single activity completely defines the continuous behavior 
of a system. Again, our model allows the attributes to be selectively updated. 







The next step in the development of the continuous action systems formalism 
is to illustrate their stepwise refinement. This will provide for the derivation of 
executable control programs that are correct with respect to their specification, 
given as a continuous action system. 

Acknowledgement. We would like to thank Cristina Cerschi, Mauno Rönkkö
and Hannu Toivonen for our inspiring discussions, as well as the anonymous
referees for their useful comments on the topics covered in this paper.



References 

1. M. Abadi and L. Lamport. An old-fashioned recipe for real time. ACM Transactions
on Programming Languages and Systems, 16(5):1543-1571, 1994.

2. R. Alur, C. Courcoubetis, T.A. Henzinger, and P.H. Ho. Hybrid automata: an
algorithmic approach to the specification and verification of hybrid systems. In
R.L. Grossman, A. Nerode, A.P. Ravn, and H. Rischel, editors, Hybrid Systems I,
volume LNCS 736, pages 209-229. Springer-Verlag, 1993.

3. R. J. R. Back and C. Cerschi. Modeling and verifying a temperature control system
using hybrid action systems. In Proc. of the 5th Int. Workshop in Formal Methods
for Industrial Critical Systems, 2000, to appear.

4. R. J. R. Back and R. Kurki-Suonio. Decentralization of process nets with centralized
control. In 2nd Symp. on Principles of Distributed Computing, volume LNCS
873, pages 131-142. ACM SIGACT-SIGOPS, 1983.

5. R. J. R. Back and K. Sere. Stepwise refinement of parallel algorithms. In Science
of Computer Programming 13, pages 133-180, 1991.

6. R. J. R. Back and K. Sere. From action systems to modular systems. In Formal
Methods Europe (FME ’94), volume LNCS 873, pages 1-25. Springer-Verlag, 1994.

7. R. J. R. Back and J. von Wright. Refinement Calculus - A Systematic Introduction.
Springer-Verlag, 1998.

8. R. J. R. Back, L. Petre, and I. Porres. Generalizing action systems to hybrid systems.
Technical Report 307, TUCS Turku Centre for Computer Science, 1999.

9. M.S. Branicky. General hybrid dynamical systems: modeling, analysis and control.
In R. Alur, T. A. Henzinger, and E. D. Sontag, editors, Hybrid Systems III, volume
LNCS 1066, pages 186-200. Springer-Verlag, 1996.

10. E. W. Dijkstra. A Discipline of Programming. Prentice-Hall International, 1976.

11. Y. Kesten, Z. Manna, and A. Pnueli. Verification of clocked and hybrid systems. In
Lectures on Embedded Systems, volume LNCS 1494, pages 4-73. Springer-Verlag,
1998.

12. C. Lewerentz and T. Lindner. Formal Development of Reactive Systems: Case
Study Production Cell, volume LNCS 891. Springer-Verlag, 1995.

13. A. Nerode and W. Kohn. Models for hybrid systems: automata, topologies,
controllability, observability. In R.L. Grossman, A. Nerode, A.P. Ravn, and H. Rischel,
editors, Hybrid Systems I, volume LNCS 736, pages 317-356. Springer-Verlag, 1993.

14. M. Rönkkö and A.P. Ravn. Action systems with continuous behaviour. In P. J.
Antsaklis, W. Kohn, M. Lemmon, A. Nerode, and S. Sastry, editors, Hybrid Systems
V, volume LNCS 1567, pages 304-323. Springer-Verlag, 1999.




Compositional Verification of Synchronous Networks

Leszek Holenderski

Dept. of Computing Sci., Technical University of Eindhoven
PO Box 513, 5600 MB Eindhoven, The Netherlands
L.Holenderski@tue.nl



Abstract. We present a logical framework for the verification of
synchronous networks in an assert-commit style. It is based on the known
observation that the Hoare rule for sequential composition is sound and 
complete for parallel composition as well. The calculus we develop in- 
side the framework is extremely simple, based on just one propositional 
tautology. Nevertheless, it is powerful enough to analyze the common 
proof strategies (monolithic, forward and backward) applied in automa- 
ted verification of such networks. This analysis leads to an incremental 
verification method, based on successive construction of the weakest pre- 
conditions, in which the backward proof is driven by the property being 
verified. In the case of finite synchronous networks this construction can 
be carried out via simple manipulations on circuits, and circuit optimizers 
can be used incrementally to simplify the complexity of such backward 
proofs. The method should hopefully be applicable in verification of soft- 
ware synchronous systems, since the current compilers for synchronous 
languages generate quite redundant circuits. 



1 Introduction 

We present a simple logical framework in which we study the problem of for- 
mal verification of reactive networks (systems formed as a hierarchy of reactive 
processes run in parallel). We consider only synchronous networks, i.e., systems 
obtained by parallel synchronous composition P\\Q, where P and Q are syn- 
chronous processes (or synchronous networks themselves). We are not concerned 
with the exact nature of the processes. For example, they can be specified in 
any of the common notations used for programming synchronous processes, like 
Argos [14,9], Esterel [7,9], Lustre [10,9] or Signal [6,9]. 

Properties of reactive systems can conveniently be specified in assert-commit
style, by Hoare triples $\alpha P \beta$, where $\alpha$ and $\beta$ are formulae in some logic and P is
a process (or a network). Assertion $\alpha$ specifies a property of the environment in
which P is executed, and commitment $\beta$ specifies a property guaranteed by P.
More precisely, $\alpha P \beta$ is valid iff $E \| P$ satisfies $\beta$, for any environment E which
satisfies $\alpha$.

Our approach is based on the old observation [20] that under the above 
interpretation of Hoare triples the following inference rule is both sound and 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 214-227, 2000. 

© Springer- Verlag Berlin Heidelberg 2000 




Compositional Verification of Synchronous Networks 215 



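(relatively) complete for a parallel composition of processes $P$ and $Q$:

$$\frac{\alpha P \gamma, \quad \gamma Q \beta}{\alpha (P \parallel Q) \beta}$$

Thus, in order to prove $\alpha (P \parallel Q) \beta$, it suffices to find an intermediate
formula $\gamma$ such that $\alpha P \gamma$ and $\gamma Q \beta$.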

The above rule is analogous to the well known rule for the sequential
composition of iterative programs. We draw on this analogy by considering the
strongest post-condition of $\alpha$ w.r.t. $P$ (denoted by $\alpha\overrightarrow{P}$) and the weakest
pre-condition of $\beta$ w.r.t. $Q$ (denoted by $\overleftarrow{Q}\beta$) in order to automate the task
of finding the intermediate formula $\gamma$.

When verifying complex systems, especially software safety-critical systems,
one is usually interested in verifying many relatively simple commitments under
the same assertion. By a relatively simple commitment we mean a property
to which only a small part of the system contributes. In this typical scenario,
synthesis of $\gamma$ as $\overleftarrow{Q}\beta$ turns out to be particularly useful: by considering
successive weakest pre-conditions $\overleftarrow{Q}\beta$ and $\overleftarrow{P}(\overleftarrow{Q}\beta)$, one in fact considers
automatically how the component processes contribute to property $\beta$.

On the way, we develop a logical framework to study proof strategies for
$\alpha (P \parallel Q) \beta$. This framework leads to a simple calculus in which such
strategies can be derived formally, just by manipulating propositional logic
formulae. In fact, the whole calculus is based on just one propositional
tautology: $(p \wedge q) \to r$ is equivalent to $p \to (q \to r)$. The reason for this
simplicity is given in Section 3.2.

The logical framework is developed in two steps. First we consider the general
case in which we are not concerned with any particular notation used to specify
components of $\alpha (P \parallel Q) \beta$. Next we instantiate the general framework in the
context where processes $P$ and $Q$ are specified by synchronous sequential circuits,
and formulae $\alpha$ and $\beta$ are specified by synchronous observers [12]. Since
synchronous observers can be translated to circuits, all the components of
$\alpha (P \parallel Q) \beta$ can be represented by circuits, and this allows the proof of
$\alpha (P \parallel Q) \beta$ to be simplified via manipulations on circuits.

The paper is organized as follows. In Section 2 we recall some basic theory 
of synchronous networks. In Section 3, the logical framework is developed, and 
three common strategies to prove $\alpha (P \parallel Q) \beta$ are derived. One of the strategies,
which we call a backward proof with weakest observers, turns out to be more 
efficient than the other two, so we analyze it in more detail, in Section 5. We 
conclude with the proposal of a simple software tool to implement the strategy. 

2 Preliminaries 

We recall those parts of the theory of synchronous networks which establish the
equivalence between the three views of synchronous processes: (1) as sets of
traces, (2) as sequential circuits and (3) as temporal logic formulae. In summary,
synchronous parallel composition $P \parallel Q$ can be described as $[\![P]\!] \cap [\![Q]\!]$
(the intersection of sets of traces), $\mathbf{P} \cup \mathbf{Q}$ (the union of circuits understood
as sets of boolean clauses), and $\mathcal{P} \wedge \mathcal{Q}$ (the conjunction of characteristic
formulae).




216 



L. Holenderski 



2.1 Processes as Traces 

We only consider pure reactive processes (i.e. those which manipulate boolean
values or, more generally, values from finite domains). A process, say $P$, operates
on a finite set of boolean variables (from some alphabet $V$) that represent the
signals used by $P$ to interact with its environment. We use the notation $P(I; O)$
to indicate that process $P$ has inputs $I$ and outputs $O$, for $I, O \subseteq V$ and
$I \cap O = \emptyset$. (For now, in order to simplify presentation, we do not consider
processes with local variables. We postpone them till Section 8.)

The semantics of synchronous processes we use in this paper is borrowed
from [11]. In synchronous computations time is assumed to be discrete. In
any instant of time, every signal is either true or false (present/absent, on/off,
up/down, ...). A computation of $P(I; O)$ is represented by a trace which is an
infinite sequence of reactions $(r_1, r_2, \ldots)$, for $r_i \subseteq V$. Reaction $r_i$ consists of
all the signals present in the $i$-th instant of time; $r_i \cap I$ is the stimulus of $P$
and $r_i \cap O$ is the response of $P$ to the stimulus.

Let $T := (2^V)^\omega$ denote the set of all traces (over alphabet $V$). Traces $t$ and
$t'$ are called compatible on variables $X \subseteq V$, denoted by $t \sim_X t'$, iff they
differ only on variables from $V \setminus X$. Formally, for $t = (r_1, r_2, \ldots)$ and
$t' = (r'_1, r'_2, \ldots)$,

$$t \sim_X t' \quad\text{iff}\quad \forall i \geq 1\; (r_i \cap X = r'_i \cap X)$$

Let $[\![P]\!] \subseteq T$ denote the behaviour of $P(I; O)$ (the set of all its traces). We
assume that the behaviour is always closed w.r.t. $\sim_{I \cup O}$ (if $t \in [\![P]\!]$ and
$t \sim_{I \cup O} t'$ then $t' \in [\![P]\!]$).
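
The compatibility relation can be made concrete on finite prefixes of traces. The following is only a minimal sketch: real traces are infinite, and the names `Reaction` and `compatible` are introduced here for illustration, not taken from the paper.

```python
from typing import FrozenSet, Sequence, Set

Reaction = FrozenSet[str]  # the signals present in one instant

def compatible(t: Sequence[Reaction], u: Sequence[Reaction], X: Set[str]) -> bool:
    """True iff two finite prefixes of equal length agree on X at every instant."""
    return len(t) == len(u) and all((r & X) == (s & X) for r, s in zip(t, u))

# Two prefixes that agree on the signals i and o but differ on x.
t = [frozenset({"i", "o"}), frozenset({"x"})]
u = [frozenset({"i", "o", "x"}), frozenset()]

agree_on_io = compatible(t, u, {"i", "o"})
agree_on_x = compatible(t, u, {"x"})
print(agree_on_io, agree_on_x)
```

A behaviour closed w.r.t. compatibility on the inputs and outputs of a process therefore never constrains signals the process does not mention.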

Processes can be put in parallel to form networks, say $P \parallel Q$. A reaction of a
network, in some instant of time, consists in simultaneous execution of reactions 
by all its processes. Communication is realized by sharing signals with the same 
name. The communication protocol is that of instantaneous broadcast: a signal 
emitted by one process is perceived as present, in the same instant it was emitted, 
by all the processes that share the signal. 

In the sequel we assume that each signal can be emitted by at most one
process. This data-flow model is directly applicable only to data-flow synchronous
languages, like Lustre and Signal. However, the imperative model employed
in Argos and Esterel (where a signal can be emitted by several processes)
can easily be reduced to the data-flow model. Replace $P_1(\ldots; o) \parallel P_2(\ldots; o)$
with $P_1(\ldots; o)[o_1/o] \parallel P_2(\ldots; o)[o_2/o] \parallel Q(o_1, o_2; o)$, where $[x/y]$ denotes
the substitution of $x$ for $y$, and $Q$ is a process which emits $o$ whenever $o_1$ or
$o_2$ is present.

Under this assumption, together with the assumption about the behaviors
being closed w.r.t. compatible traces, the behaviour of $P \parallel Q$ can simply be
formalized as $[\![P \parallel Q]\!] := [\![P]\!] \cap [\![Q]\!]$.
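
To illustrate composition as intersection, the sketch below enumerates trace prefixes of depth 2 over a three-signal alphabet. It is an illustration under stated assumptions: the two processes (a `P(i; x)` that copies `i` to `x`, and a `Q(x; o)` that copies `x` to `o`) are hypothetical, and finite prefixes stand in for infinite traces.

```python
from itertools import chain, combinations, product

V = ("i", "x", "o")  # the alphabet of signals

def powerset(names):
    """All subsets of names, as frozensets (the possible reactions)."""
    return [frozenset(c) for c in chain.from_iterable(
        combinations(names, k) for k in range(len(names) + 1))]

REACTIONS = powerset(V)
DEPTH = 2  # length of the finite prefixes we enumerate
TRACES = set(product(REACTIONS, repeat=DEPTH))

def behaviour(constraint):
    """Prefixes whose every reaction satisfies the given instantaneous constraint."""
    return {t for t in TRACES if all(constraint(r) for r in t)}

beh_P = behaviour(lambda r: ("x" in r) == ("i" in r))  # hypothetical P(i; x)
beh_Q = behaviour(lambda r: ("o" in r) == ("x" in r))  # hypothetical Q(x; o)
beh_PQ = beh_P & beh_Q                                 # behaviour of P || Q

print(len(TRACES), len(beh_P), len(beh_Q), len(beh_PQ))
```

Each behaviour leaves the signals it does not mention unconstrained, which is what makes plain intersection the right composition here.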



2.2 Processes as Circuits 

The behaviors of synchronous networks can be specified in many ways [9]. In 
this paper we prefer to use synchronous sequential circuits since it is a common 







formalism to which other (pure) synchronous formalisms can be translated, as 
in [8,15]. 

By a circuit we mean a set of boolean clauses that specify both the combi- 
national and sequential part of the circuit. In order to avoid potential problems 
with causal correctness, we assume that all synchronous processes and networks 
we consider have no combinational cycles, when translated to circuits. 

In the sequel we use bold letters $\mathbf{O}$, $\mathbf{P}$ and $\mathbf{Q}$ to denote circuits representing
processes $O$, $P$ and $Q$, respectively. Under the data-flow assumption, $P \parallel Q$ can
simply be specified by $\mathbf{P} \cup \mathbf{Q}$. (Formally, one can give the trace semantics to
circuits, such that $[\![\mathbf{P} \cup \mathbf{Q}]\!] = [\![\mathbf{P}]\!] \cap [\![\mathbf{Q}]\!]$.)

2.3 Processes as Formulae 

The behaviors of synchronous networks can also be specified by formulae in some
logic $\mathcal{L}$ of traces. Usually, $\mathcal{L}$ is some variant of linear temporal logic. The simple
logical framework we develop in this paper does not rely on any particular logic
of traces. Although we need some assumptions about $\mathcal{L}$, the assumptions are
fairly standard and are satisfied by most temporal logics.

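Let $\alpha$, $\beta$, $\gamma$ be formulae of $\mathcal{L}$. As usual, $t \models \alpha$ stands for “trace $t$
satisfies $\alpha$”. Let $[\![\alpha]\!]$ denote the set of all traces which satisfy $\alpha$. As usual,
$\models \alpha$ stands for “$\alpha$ is valid”, and denotes $[\![\alpha]\!] = T$.

We assume that $\mathcal{L}$ contains at least the following connectives: $\wedge$, $\to$ and $\Box$,
and they are interpreted such that $[\![\alpha \wedge \beta]\!] = [\![\alpha]\!] \cap [\![\beta]\!]$,
$\models \alpha \to \beta$ iff $[\![\alpha]\!] \subseteq [\![\beta]\!]$, and $t \models \Box\alpha$ iff $\alpha$ is satisfied by all
suffixes of $t$. In addition, we require that

$$\models \alpha \wedge \beta \to \gamma \quad\text{iff}\quad \models \alpha \to (\beta \to \gamma) \qquad (1)$$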
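
At the purely propositional level, requirement (1) is the familiar equivalence between $(p \wedge q) \to r$ and $p \to (q \to r)$. As a sanity check, the following minimal sketch enumerates all truth assignments; the helper `implies` is introduced here for illustration, and the check says nothing about the temporal connectives of $\mathcal{L}$.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material implication p -> q."""
    return (not p) or q

# (a and b) -> c must agree with a -> (b -> c) under every assignment.
currying_holds = all(
    implies(a and b, c) == implies(a, implies(b, c))
    for a, b, c in product([False, True], repeat=3)
)
print(currying_holds)
```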

We also assume that $\mathcal{L}$ is expressive enough to characterize synchronous
processes, in the sense that for any process, say $P$, there exists a formula, say
$\mathcal{P}$, such that $[\![P]\!] = [\![\mathcal{P}]\!]$. Such a formula is called a characteristic formula
of a process. For example, the logic given in [11] is expressive in this sense.

In the sequel we use calligraphic letters $\mathcal{O}$, $\mathcal{P}$ and $\mathcal{Q}$ to denote characteristic
formulae of processes $O$, $P$ and $Q$, respectively. Obviously, $\mathcal{P} \wedge \mathcal{Q}$ is a
characteristic formula of $P \parallel Q$.

3 Logical Framework

3.1 Assert-Commit Formulae

A property of a reactive process can be regarded as a set of traces that have
the property. As usual, we say that process $P$ has property $X$ iff $[\![P]\!] \subseteq X$. We
assume that properties are specified by formulae in the logic $\mathcal{L}$ described in the
previous section. As usual, we say that process $P$ satisfies formula $\alpha$, denoted
by $P \models \alpha$, iff $[\![P]\!] \subseteq [\![\alpha]\!]$.

In fact, we will specify behaviors of processes in the assert-commit style, by
formulae of the form $\alpha P \beta$, where $\alpha, \beta \in \mathcal{L}$. The intuitive meaning of such Hoare







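triples is the following: $\alpha P \beta$ is valid iff $P$ has property $\beta$ whenever executed in
an environment which has property $\alpha$. Formally,

$$\models \alpha P \beta \quad\text{iff}\quad \forall E\, (\text{if } E \models \alpha \text{ then } E \parallel P \models \beta) \qquad (2)$$

The meaning of triple $\alpha P \beta$ can also be formalized in logic $\mathcal{L}$, as

$$\models \alpha P \beta \quad\text{iff}\quad \models \alpha \wedge \mathcal{P} \to \beta \qquad (3)$$

or equivalently, by (1), as

$$\models \alpha P \beta \quad\text{iff}\quad \models \alpha \to (\mathcal{P} \to \beta) \qquad (4)$$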

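The semantical and logical characterizations of the validity of $\alpha P \beta$ are (almost)
equivalent:

Lemma 1. $\models \alpha \wedge \mathcal{P} \to \beta$ implies $\forall E\, (\text{if } E \models \alpha \text{ then } E \parallel P \models \beta)$.

Proof. If $E \models \alpha$ then $E \parallel P \models \alpha \wedge \mathcal{P}$, hence $E \parallel P \models \beta$. □

Lemma 2. If logic $\mathcal{L}$ is realizable, in the sense that any formula $\alpha \in \mathcal{L}$ is
a characteristic formula of some process, then
$\forall E\, (\text{if } E \models \alpha \text{ then } E \parallel P \models \beta)$ implies $\models \alpha \wedge \mathcal{P} \to \beta$.

Proof. Take as $E$ the process characterized by $\alpha$; then $E \parallel P \models \beta$ is equivalent
to $[\![\alpha]\!] \cap [\![P]\!] \subseteq [\![\beta]\!]$, which is equivalent to $\models \alpha \wedge \mathcal{P} \to \beta$. □

In order to resolve the “almost” issue, in the rest of the paper we assume
that $\mathcal{L}$ is realizable, and thus the meaning of $\alpha P \beta$ is defined equivalently by (2),
(3) or (4).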

3.2 Strongest Post-Condition and Weakest Pre-Condition

Intuitively, the strongest post-condition of $\alpha$ w.r.t. $P$, denoted by $\alpha\overrightarrow{P}$, is the
strongest property guaranteed by $P$, when $P$ is executed in the environment
which has property $\alpha$. Dually, the weakest pre-condition of $\beta$ w.r.t. $P$, denoted
by $\overleftarrow{P}\beta$, is the weakest property of the environment, in order for $P$ to guarantee
property $\beta$.

Formally, $\alpha\overrightarrow{P}$ and $\overleftarrow{P}\beta$ are formulae such that $\models \alpha\overrightarrow{P} \to \beta$ iff $\models \alpha P \beta$, and
$\models \alpha \to \overleftarrow{P}\beta$ iff $\models \alpha P \beta$. From (3) and (4) it follows that

$$\alpha\overrightarrow{P} = \alpha \wedge \mathcal{P} \qquad (5)$$

$$\overleftarrow{P}\beta = \mathcal{P} \to \beta \qquad (6)$$

Notice that $\overrightarrow{P}$ and $\overleftarrow{P}$ form a well known Galois connection (by the equivalence
$\models \alpha\overrightarrow{P} \to \beta$ iff $\models \alpha \to \overleftarrow{P}\beta$). Also $\wedge$ and $\to$ form a well known Galois
connection (by the tautology (1)). In the presented framework, both Galois
connections coincide, and the simplicity of our calculus follows from the fact
that we can use them interchangeably.
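
The Galois connection can be checked exhaustively on a small finite model. The sketch below is an illustration under stated assumptions: formulae and processes are both modelled as subsets of a four-element universe of abstract traces, and the names `sp`, `wp` and `subsets` are introduced here, not taken from the paper.

```python
from itertools import chain, combinations

U = frozenset(range(4))  # a tiny universe of abstract "traces"

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def sp(alpha, P):
    """Strongest post-condition as in (5): alpha AND P."""
    return alpha & P

def wp(P, beta):
    """Weakest pre-condition as in (6): P -> beta, i.e. (not P) OR beta."""
    return (U - P) | beta

# sp(alpha) entails beta exactly when alpha entails wp(beta).
galois = all(
    (sp(a, P) <= b) == (a <= wp(P, b))
    for P in subsets(U) for a in subsets(U) for b in subsets(U)
)
print(galois)
```

Here validity of an implication is read as set inclusion, matching the interpretation of $\to$ given in Section 2.3.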







3.3 The Inference Rule 

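The following rule is both sound and complete:

$$\frac{\alpha P \gamma, \quad \gamma Q \beta}{\alpha (P \parallel Q) \beta}$$

The soundness of the rule follows directly from (3), by a simple propositional
reasoning to justify that $\models \alpha \wedge \mathcal{P} \wedge \mathcal{Q} \to \beta$ follows from
$\models \alpha \wedge \mathcal{P} \to \gamma$ and $\models \gamma \wedge \mathcal{Q} \to \beta$.

For completeness it suffices to assume $\models \alpha (P \parallel Q) \beta$ and take as $\gamma$ either
$\alpha\overrightarrow{P}$ or $\overleftarrow{Q}\beta$. For example, let $\gamma := \alpha\overrightarrow{P} = \alpha \wedge \mathcal{P}$. Then $\alpha P \gamma$ is valid (as
equivalent to the validity of $\alpha \wedge \mathcal{P} \to \alpha \wedge \mathcal{P}$) and $\gamma Q \beta$ is valid (as
equivalent to the validity of $\alpha \wedge \mathcal{P} \wedge \mathcal{Q} \to \beta$, which follows from the
assumption $\models \alpha (P \parallel Q) \beta$).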
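
Both halves of this argument can be replayed by brute force on a finite model. This is a sketch under stated assumptions rather than a proof: formulae and processes are subsets of a three-element universe, parallel composition is intersection, and `valid` encodes characterization (3); all names are introduced here for illustration.

```python
from itertools import chain, combinations, product

U = frozenset(range(3))  # a tiny universe of abstract "traces"

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

SETS = subsets(U)

def valid(alpha, P, beta):
    """Validity of the triple alpha P beta, following (3): alpha AND P entails beta."""
    return alpha & P <= beta

# Soundness: any intermediate gamma that works for P and Q works for P || Q.
sound = all(
    valid(a, P & Q, b)
    for a, P, Q, b, g in product(SETS, repeat=5)
    if valid(a, P, g) and valid(g, Q, b)
)

# Completeness: for a valid triple, both the strongest post-condition (a & P)
# and the weakest pre-condition ((U - Q) | b) serve as intermediate formula.
complete = all(
    valid(a, P, g) and valid(g, Q, b)
    for a, P, Q, b in product(SETS, repeat=4)
    if valid(a, P & Q, b)
    for g in (a & P, (U - Q) | b)
)
print(sound, complete)
```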

3.4 Proof Strategies 

In this section, we analyze the well-known proof strategies for establishing
validity of $\alpha (P \parallel Q) \beta$. The analysis can be done in a very simple way, using the
simple logical framework we have developed so far.

Directly from definition (3),

$$\models \alpha (P \parallel Q) \beta \quad\text{iff}\quad \models \alpha \wedge \mathcal{P} \wedge \mathcal{Q} \to \beta$$

and formula $\alpha \wedge \mathcal{P} \wedge \mathcal{Q} \to \beta$ is equivalent, again by (1), to the following formulae:

(monolithic) $\mathcal{P} \wedge \mathcal{Q} \to (\alpha \to \beta)$
(forward) $((\alpha \wedge \mathcal{P}) \wedge \mathcal{Q}) \to \beta$
(backward) $\alpha \to (\mathcal{P} \to (\mathcal{Q} \to \beta))$

The first formula represents a monolithic proof strategy since it can be rewritten
as $P \parallel Q \models \alpha \to \beta$, and this can be proved by using a model checker for the
model $P \parallel Q$ and property $\alpha \to \beta$.

The second formula represents a forward proof strategy since it can be rewritten
as $((\alpha\overrightarrow{P})\overrightarrow{Q}) \to \beta$, and this represents the method of pushing $\alpha$ forward,
towards $\beta$, via the chain $P \parallel Q$.

The third formula represents a backward proof strategy since it can be rewritten
as $\alpha \to (\overleftarrow{P}(\overleftarrow{Q}\beta))$, and this represents the method of pulling $\beta$ backward,
towards $\alpha$, via the chain $P \parallel Q$.

The three strategies seem to be equally difficult since the formulae which
characterize them have the same complexity, in terms of their size. However,
the forward and backward strategies can easily be improved, and thus have an
advantage over the monolithic strategy.

In case of the forward proof, instead of forming the whole implication
$((\alpha \wedge \mathcal{P}) \wedge \mathcal{Q}) \to \beta$ in one step, one can first form $\alpha \wedge \mathcal{P}$ (thus, $\alpha\overrightarrow{P}$),
simplify it to some $\gamma$, then form $\gamma \wedge \mathcal{Q}$ (thus, $\gamma\overrightarrow{Q}$), simplify it to some $\gamma'$,
and finally form the simpler implication $\gamma' \to \beta$.







Similarly, in case of the backward proof, instead of forming the whole
implication $\alpha \to (\mathcal{P} \to (\mathcal{Q} \to \beta))$ in one step, one can first form $\mathcal{Q} \to \beta$
(thus, $\overleftarrow{Q}\beta$), simplify it to some $\gamma$, then form $\mathcal{P} \to \gamma$ (thus, $\overleftarrow{P}\gamma$), simplify
it to some $\gamma'$, and finally form the simpler implication $\alpha \to \gamma'$.

In Section 5 we show how to incrementally use circuit optimizers as the 
simplifying devices. Such optimizers can be perceived as automated theorem 
provers in propositional logic. They are usually incomplete, due to the use of 
more efficient, non-exponential, algorithms. 

The improved forward and backward strategies are very similar. However, the 
backward proof turns out to have an advantage over the forward proof, under the 
verification scenario presented in the Introduction. This is quite obvious since 
the forward proof is usually too general ($\alpha$ is too strong), quite unnecessarily.



4 Synchronous Observers 



In the rest of this paper we will analyze the backward proof strategy in the
context where both $\alpha$ and $\beta$, in $\alpha P \beta$, are synchronous observers [12]. (Note that
our formalization of observers is much simpler than the one given in [12].)

An observer is a pair $(O : o)$ where $O$ is a process with a distinguished output
$o$. Observers specify safety properties of processes. Intuitively, process $P$ satisfies
property $(O : o)$ iff $P \parallel O$ always emits $o$. Formally,

$$\begin{array}{rcl}
P \models (O : o) & \text{iff} & P \parallel O \models \Box o \\
 & \text{iff} & \models \mathcal{P} \wedge \mathcal{O} \to \Box o \\
 & \text{iff} & \models \mathcal{P} \to (\mathcal{O} \to \Box o) \\
 & \text{iff} & P \models \mathcal{O} \to \Box o
\end{array}$$

Thus, the property specified by the observer $(O : o)$ can be defined by formula
$\mathcal{O} \to \Box o$. We call this semantics a weak interpretation of observer $(O : o)$, and
denote the formula by $(O \triangleleft o)$. Another semantics of observer $(O : o)$ can be
given by formula $\mathcal{O} \wedge \Box o$, denoted by $(O \triangleright o)$. We call this semantics a strong
interpretation of observer $(O : o)$.

We use strong observers to specify assertions and weak observers to
specify commitments. In other words, we only consider triples $(A \triangleright a) P (B \triangleleft b)$,
for some observers $(A : a)$ and $(B : b)$. Such triples are characterized by formula

$$\mathcal{A} \wedge \Box a \wedge \mathcal{P} \to (\mathcal{B} \to \Box b),$$

or equivalently, by

$$\mathcal{A} \wedge \mathcal{P} \wedge \mathcal{B} \to (\Box a \to \Box b).$$

This particular choice is one of the main reasons for the simplicity of our
calculus, as manifested in the sequel.
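
The equivalence of the two characterizations uses nothing beyond (1) and the commutativity of conjunction. Spelled out as a short derivation (a worked restatement of the step, with each line valid iff the next one is):

```latex
\begin{align*}
  & \models \mathcal{A} \wedge \Box a \wedge \mathcal{P} \to (\mathcal{B} \to \Box b) \\
  \Longleftrightarrow\; & \models \mathcal{A} \wedge \Box a \wedge \mathcal{P} \wedge \mathcal{B} \to \Box b
      && \text{by (1)} \\
  \Longleftrightarrow\; & \models \mathcal{A} \wedge \mathcal{P} \wedge \mathcal{B} \wedge \Box a \to \Box b
      && \text{commutativity of } \wedge \\
  \Longleftrightarrow\; & \models \mathcal{A} \wedge \mathcal{P} \wedge \mathcal{B} \to (\Box a \to \Box b)
      && \text{by (1)}
\end{align*}
```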







4.1 The Strongest and Weakest Observers 

It turns out that the construction of the strongest post-condition $\alpha\overrightarrow{P}$ and the
weakest pre-condition $\overleftarrow{P}\beta$ is surprisingly easy when $\alpha$ and $\beta$ are observers and
$P$, $\alpha$ and $\beta$ are specified by circuits. This is in contrast with the methods of
synthesizing the weakest pre-condition via manipulations on Mealy machines, as
presented in [12,5].

Consider the following calculations:

$$(O \triangleright o)\overrightarrow{P} = (\mathcal{O} \wedge \Box o) \wedge \mathcal{P} = (\mathcal{O} \wedge \mathcal{P}) \wedge \Box o = (O \parallel P \triangleright o)$$

$$\overleftarrow{P}(O \triangleleft o) = \mathcal{P} \to (\mathcal{O} \to \Box o) = (\mathcal{P} \wedge \mathcal{O}) \to \Box o = (P \parallel O \triangleleft o)$$

The calculations show that the same construction, namely the observer $(O \parallel P : o)$
obtained by putting process $O$ in parallel with process $P$, leads to both the
strongest post-condition and the weakest pre-condition (provided one applies $\overrightarrow{P}$
to a strong observer and $\overleftarrow{P}$ to a weak observer). Notice that since $P$ and $O$ are
specified as circuits, we could equally well write $(\mathbf{O} \cup \mathbf{P} : o)$ instead of
$(O \parallel P : o)$, and thus the cost of this construction is linear in the size of
$\mathbf{P}$ and $\mathbf{O}$.
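
The construction can be sketched in a few lines if a circuit is represented as a map from each defined signal to its defining clause. This representation, and the names `Circuit`, `Observer`, `compose` and `pull_back`, are assumptions made for illustration; the clauses are kept as uninterpreted text.

```python
from typing import Dict, NamedTuple

Circuit = Dict[str, str]  # defined signal -> defining boolean clause (as text)

class Observer(NamedTuple):
    circuit: Circuit
    output: str  # the distinguished output o

def compose(p: Circuit, q: Circuit) -> Circuit:
    """P || Q as the union of clauses; each signal may have only one emitter."""
    clash = p.keys() & q.keys()
    if clash:
        raise ValueError(f"signals defined twice: {sorted(clash)}")
    return {**p, **q}

def pull_back(p: Circuit, obs: Observer) -> Observer:
    """The observer (P || O : o); read weakly, it is the weakest
    pre-condition of (O <| o) with respect to P."""
    return Observer(compose(p, obs.circuit), obs.output)

# Hypothetical example: a one-latch process and an observer over its output.
P = {"x": "pre(i)"}
B = Observer({"b": "not x or pre(i)"}, "b")
pulled = pull_back(P, B)
print(pulled)
```

The union touches every clause once, which reflects the linear cost noted above; all the real work is deferred to the subsequent optimization step.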

5 Backward Proofs with the Weakest Observers 

The proof of $\models (A \triangleright a)(P \parallel Q)(B \triangleleft b)$ can proceed as follows:

- Form the weakest observer

$$\overleftarrow{Q}\beta := (Q \parallel B \triangleleft b)$$

and simplify $Q \parallel B$ to $Q' \parallel B'$, using some sequential circuit optimizer, such
as [19]. In addition, one can use some special-purpose algorithms, for example
[17,16], to further reduce the number of latches in $Q \parallel B$.

- Form the weakest observer

$$\overleftarrow{P}(\overleftarrow{Q}\beta) := \overleftarrow{P}(Q' \parallel B' \triangleleft b) = (P \parallel Q' \parallel B' \triangleleft b)$$

and simplify $P \parallel Q' \parallel B'$ to $P' \parallel Q'' \parallel B''$, as above.

- Prove $\models \alpha \to \overleftarrow{P}(\overleftarrow{Q}\beta)$ by using either a tautology checker on formula

$$\mathcal{A} \wedge \Box a \to (\mathcal{P}' \wedge \mathcal{Q}'' \wedge \mathcal{B}'' \to \Box b)$$

or a model checker on

$$A \parallel P' \parallel Q'' \parallel B''$$

In the above scenario, the assertion $(A \triangleright a)$ is only used at the end of a proof.
If $\Box a$ was used during the optimization, as an additional “don’t care” condition,
the optimization of the weakest observers constructed in the backward proof
could lead to more simplifications. Such an improvement can easily be derived
in our calculus, as shown below.







Let $(D : d_1, d_2)$ denote a double observer which is a process $D$ with two
distinguished outputs $d_1$ and $d_2$. Its (weak) semantics is denoted by
$(D \triangleleft d_1, d_2)$, and is defined by the formula $\mathcal{D} \to (\Box d_1 \to \Box d_2)$.

It is easy to check that the weakest double observer can be constructed in
the same way as the weakest single observer:
$\overleftarrow{P}(D \triangleleft d_1, d_2) = (P \parallel D \triangleleft d_1, d_2)$, for any process $P$. It is also easy to
check that $\models (A \triangleright a) P (B \triangleleft b)$ is equivalent to
$\models (\mathit{true}) P (A \parallel B \triangleleft a, b)$. Thus, the proof of $\models (A \triangleright a)(P \parallel Q)(B \triangleleft b)$
can proceed as above, by pulling $(A \parallel B \triangleleft a, b)$ backward through the chain
$P \parallel Q$, and optimizing the intermediate circuits under the “don’t care”
condition $\Box a$.



6 The Software Tool 

The software tool which might implement our incremental method is quite simple.
It is fed with the formula $(A \triangleright a)(P_1 \parallel \cdots \parallel P_n)(B \triangleleft b)$, given as $n + 2$
files which contain descriptions of respective circuits, and returns either a
confirmation that the formula is valid, or a counter-example trace if the
formula is not valid.




Fig. 1. The software tool



The tool consists of four main software components arranged in the structure
depicted in Fig. 1. The input circuits are read in by CircuitReader and
converted to the form suitable as input to CircuitOptimizer. Coordinator
implements the loop in which the successive weakest observers are formed and fed
to the optimizer. The loop is finished if either the optimizer returns the “empty”
observer true, in which case the input formula is valid, or there are no more
processes to be processed. In the latter case, the coordinator invokes the tautology
(or model) checker (component t-checker/m-checker) and reports the result.
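
The coordinator loop described above might be sketched as follows. This is a hypothetical illustration, not the tool itself: circuits are maps from signals to clause text, the optimizer and checker are passed in as plain callables, the “empty” observer is modelled as an empty map, and counter-example reporting is left out.

```python
from typing import Callable, Dict, List

Circuit = Dict[str, str]  # defined signal -> defining clause (as text)

def coordinator(processes: List[Circuit],
                observer: Circuit,
                optimize: Callable[[Circuit], Circuit],
                check: Callable[[Circuit], bool]) -> bool:
    """Pull the observer backward through P1 || ... || Pn, optimizing after
    each step; stop early if the observer collapses to 'true'."""
    current = optimize(observer)
    for proc in reversed(processes):
        if not current:  # empty observer: the formula is valid
            return True
        current = optimize({**proc, **current})  # next weakest observer
    # No more processes: fall back to the tautology or model checker.
    return True if not current else check(current)

# Stub components, for illustration only.
identity = lambda c: c
result_checked = coordinator([{"x": "i"}], {"b": "x"}, identity, lambda c: len(c) == 2)
result_trivial = coordinator([{"x": "i"}], {"b": "x"}, lambda c: {}, lambda c: False)
print(result_checked, result_trivial)
```

Because the optimizer and the checker are parameters, swapping in a different optimization script or checker leaves the loop itself untouched.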

Notice that this simple architecture allows the tool to be parameterized, in a
relatively easy way, with different combinations of CircuitOptimizer and checker.











The tool can be used to experiment with different heuristics (expressed as
scripts for the hardware optimizer) for improving efficiency of proving
$\alpha (P \parallel Q) \beta$, compared to the usual monolithic strategy.



7 Possible Optimization Strategies 

The incremental method of constructing the weakest observer of a network is 
sensitive to the order in which processes are applied and to the optimization 
strategy. 

The order in which processes are applied to construct the intermediate wea- 
kest observers may influence their size, much the same as the order of variables 
may influence the size of BDDs. Usually, the process which influences the most 
variables appearing in the initial observer should be applied first since this will 
usually lead to maximal reductions during the optimization phase. The second 
process should influence the most variables appearing in the optimized observer 
obtained in the previous step, and so on. 
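
The ordering heuristic can be approximated by a greedy selection. The sketch below rests on a simplifying assumption that is not part of the method: instead of inspecting the optimized observer, it estimates the observer's support after each step by removing the outputs of the chosen process and adding its inputs. The names `Proc` and `greedy_order` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Proc = Tuple[Set[str], Set[str]]  # (inputs, outputs) of a process

def greedy_order(procs: Dict[str, Proc], support: Set[str]) -> List[str]:
    """Repeatedly pick the process whose outputs overlap most with the
    variables the current observer depends on (ties broken by name)."""
    remaining = dict(procs)
    support = set(support)
    order: List[str] = []
    while remaining:
        name = max(sorted(remaining), key=lambda n: len(remaining[n][1] & support))
        inputs, outputs = remaining.pop(name)
        order.append(name)
        # Crude estimate of the next observer's support (no optimization).
        support = (support - outputs) | inputs
    return order

# A two-stage chain i -> x -> y with an observer that reads y.
chain_order = greedy_order({"P": ({"i"}, {"x"}), "Q": ({"x"}, {"y"})}, {"y"})
print(chain_order)
```

For a simple chain this recovers the backward order; an actual implementation would recompute the support from the optimized circuit after every step.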

The optimization strategy should try to minimize the number of latches, 
according to the common belief that the number of state bits in a model is 
the most important factor which contributes to the state explosion problem in 
model checkers. So far we have only considered two optimization algorithms: the 
Incompatible Sets algorithms for removal of redundant latches, as described in 
[17,16], and retiming. 

The Incompatible Sets algorithms were designed for optimizing circuits ob- 
tained from imperative synchronous programs, written in Esterel and Argos, 
for example. Their efficiency in removing latches is due to the fact that cur- 
rent compilers for imperative synchronous languages employ quite simple state 
encoding techniques, usually the hot-state encoding, and thus introduce many 
redundant latches. Encouraging experimental results are well documented in 
[17,16], and we hope that applying the algorithms incrementally should in many 
cases produce even better results. 

On the other hand, the Incompatible Sets algorithms do not seem to be
very helpful in optimizing the number of latches in declarative (data flow)
synchronous programs, where there are no explicit control states to encode, and
instead, latches are introduced to implement the delay operators. In this case,
retiming can be helpful in removing redundant latches, as explained below.

In what follows, we use pre to denote the delay operator. (This notation
is borrowed from Lustre where pre(e) is an expression which gives the value
of expression e in the previous instant.) Recall that pre distributes w.r.t. most
operators, and this can be used to optimize the number of latches needed to
compute complex expressions which involve pre. For example,
$\mathrm{pre}(a) \vee \mathrm{pre}(b) = \mathrm{pre}(a \vee b)$, and thus only 1 latch is needed for
expression $\mathrm{pre}(a) \vee \mathrm{pre}(b)$.
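
The distribution law can be checked by simulating both expressions on finite boolean streams. In this minimal sketch, `pre` is modelled as a unit delay whose latch is initialised to false; that initial value is an assumption made here (in Lustre the first value of pre is undefined), and the helper names are illustrative.

```python
from itertools import product
from typing import List

def pre(stream: List[bool], init: bool = False) -> List[bool]:
    """Unit delay: the value of the stream in the previous instant."""
    return [init] + stream[:-1]

def por(a: List[bool], b: List[bool]) -> List[bool]:
    """Pointwise disjunction of two streams."""
    return [x or y for x, y in zip(a, b)]

# Compare the two-latch and the one-latch expression on all streams of length 4.
n = 4
same = all(
    por(pre(list(a)), pre(list(b))) == pre(por(list(a), list(b)))
    for a in product([False, True], repeat=n)
    for b in product([False, True], repeat=n)
)
print(same)
```

The left-hand expression stores two delayed values while the right-hand one stores a single delayed disjunction, which is the saving retiming exploits.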

Since the retiming algorithm can be perceived as exactly this kind of sim- 
plifier, it can often substantially reduce the number of latches in declarative 
programs with complex expressions involving pre. Although a programmer will 







seldom write such unoptimized expressions himself, nevertheless they may ap- 
pear implicitly, as a result of compilation. This is illustrated by the following 
example. 



7.1 An Example of Optimization by Retiming 

We consider a simple example of a pipelined computing device (it comes from a 
very high level abstraction of a pipelined microprocessor) . 

Let comp (which stands for component) denote a process which gets a vector
of bits (the input word) and produces a vector of bits (the output word).
Both words have the same length $w$ (which stands for width), and the bits are
numbered from 0 to $w - 1$. If in is the input word and out is the output word
then

$$out[i] = \mathrm{pre}(\neg\, in[i - 1]), \quad \text{for } 0 \leq i < w$$

where $\neg$ is negation and $in[-1] = in[w - 1]$. In other words, in is rotated by one
bit, inverted, and delayed by one clock cycle. The component contains $w$ latches,
of course.

Let pipe denote the process consisting of $d$ component processes ($d$ stands
for depth) such that inputs of component $i$ are driven by outputs of component
$i - 1$. In data flow notation, the pipe can be specified as

$$out = \underbrace{comp(comp(\ldots comp}_{d \text{ times}}(in) \ldots))$$

The pipe contains $d * w$ latches, of course.
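
A cycle-by-cycle simulation makes the structure of the pipe and its latch count concrete. This is a sketch under stated assumptions: all latches start at false, and the function names `comp_step` and `pipe_step` are introduced here for illustration.

```python
from typing import List, Tuple

Word = List[bool]

def comp_step(state: Word, inp: Word) -> Tuple[Word, Word]:
    """One clock cycle of comp: emit the latched word, then latch the
    rotated and inverted input, i.e. out[i] = pre(not in[i-1])."""
    w = len(inp)
    next_state = [not inp[(i - 1) % w] for i in range(w)]
    return state, next_state

def pipe_step(states: List[Word], inp: Word) -> Tuple[Word, List[Word]]:
    """One clock cycle of d chained components; states[k] holds the
    w latches of component k."""
    new_states = []
    word = inp
    for st in states:
        word, nxt = comp_step(st, word)  # this component's output feeds the next
        new_states.append(nxt)
    return word, new_states

w, d = 3, 2
states = [[False] * w for _ in range(d)]
out1, states = pipe_step(states, [True, False, False])
out2, states = pipe_step(states, [False] * w)
out3, states = pipe_step(states, [False] * w)
latches = sum(len(s) for s in states)
print(out3, latches)
```

With depth 2 the first input word reappears two cycles later, rotated by two positions and inverted twice, and the state comprises $d * w$ latches in total.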

Let $f$ be some boolean function on a word of bits. In our example $f$ is simply
the $w$-wide “or”, and the property, or rather its commitment part, is

$$f(out) \wedge \neg f(out^{-1})$$

where

$$out^{-1} = \underbrace{comp(comp(\ldots comp}_{d-1 \text{ times}}(in) \ldots))$$

The example is used to illustrate how the incremental simplification can
automatically discover that the property which syntactically depends on $w * d$
latches in fact (i.e., semantically) depends only on $d$ latches.

The weakest observer of /3 w.r.t. pipe was constructed incrementally, going 
backwards through the components of the pipe, as described in Section 5. During 
this process, successive pre-observers were optimized using the retiming proce- 
dure from the SIS package [19]. The final weakest observer had only d latches. 

We conducted several experiments on SPARCstation 5 with 64MB memory, 
using SIS version 1.2, with retime script retime -nm -c 10000.0. Each experi- 
ment consisted in executing a Tcl script which for a given pair (w, d) constructed 
the circuits for pipe and the initial observer (as BLIF files), and then invoked 







            d = 4          d = 8          d = 16
   w     incr   mono    incr   mono    incr    mono
   4        6      2       9      3      19      17
   8        6      3      10     18      21     266
  16        7     17      13    261      30   >3600

Table 1. Optimization by retiming (running times in seconds)



SIS (once for monolithic optimization, and $d$ times for incremental
optimization). The results are summarized in Table 1, which gives the time, in
seconds, of running one experiment (rows correspond to $w$ and columns
correspond to $d$).

As expected, for small $w * d$, the incremental optimization is slower than the
monolithic one, due to the overhead caused by many invocations of SIS and the
construction of weakest observers. However, for bigger $w * d$ the incremental
retiming was much faster than monolithic retiming. In fact, we were not patient
enough to succeed in performing the monolithic retiming for $w = 16$ and $d = 16$.



8 Possible Extensions 

Local signals can simply be treated as global ones (with the obvious restriction
that they cannot be used in the assert-commit specification). This does not pose
problems since the existential quantifier which models signal hiding can always be
“factored out” as a universal quantifier, in the following way. The assert-commit
formula $\alpha (P \parallel Q) \beta$ is formalized as $\alpha \wedge \mathcal{P} \wedge \mathcal{Q} \to \beta$ while
$\alpha (\text{local } x \text{ in } P \parallel Q) \beta$ should be formalized as
$\alpha \wedge (\exists x.\, \mathcal{P} \wedge \mathcal{Q}) \to \beta$. Although the existential quantifier does not, in
general, distribute over conjunction, it does distribute in this case (no free
$x$ outside $\mathcal{P} \wedge \mathcal{Q}$), and thus the two formulae are equivalent.
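
The factoring step can be spelled out as a chain of equivalences. The derivation below is a worked restatement which assumes that quantification over a signal obeys the usual classical laws and that $x$ occurs free in neither $\alpha$ nor $\beta$:

```latex
\begin{align*}
  & \models \alpha \wedge (\exists x.\, \mathcal{P} \wedge \mathcal{Q}) \to \beta \\
  \Longleftrightarrow\; & \models (\exists x.\, \alpha \wedge \mathcal{P} \wedge \mathcal{Q}) \to \beta
      && x \text{ not free in } \alpha \\
  \Longleftrightarrow\; & \models \forall x.\, (\alpha \wedge \mathcal{P} \wedge \mathcal{Q} \to \beta)
      && x \text{ not free in } \beta \\
  \Longleftrightarrow\; & \models \alpha \wedge \mathcal{P} \wedge \mathcal{Q} \to \beta
      && \text{validity ranges over all values of } x
\end{align*}
```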

In principle, one may also consider liveness properties, although of quite
limited form. The logical framework is also sound for observers with special
output signals interpreted with $\Diamond$ instead of $\Box$. Unfortunately, this encoding
does not handle nested temporal modalities, and we are not aware of any hardware
optimizer which could simplify circuits under such “eventual don’t cares”.

In principle, both the simple logical framework and the incremental verification
method via successive simplifications of the weakest pre-conditions can be
lifted to asynchronous networks, provided one works with the semantics in which
$[\![P \parallel Q]\!] = [\![P]\!] \cap [\![Q]\!]$, as in TLA [13]. Unfortunately, the known problem
remains how to simplify the weakest pre-condition $\mathcal{P} \to \beta$. The methods
proposed, such as [4], are quite involved.

It is not clear to us under what assumptions our simple calculus is sound
w.r.t. the constructive semantics of synchronous networks [21]. We do not know
any simple syntactical characterization of the set of processes for which
$[\![P \parallel Q]\!] = [\![P]\!] \cap [\![Q]\!]$, under this semantics.







9 Conclusions 

We have presented a simple logical framework for analyzing compositional veri- 
fication of reactive networks. In this framework, the common forward/backward 
strategies for incremental verification, via the strongest post-conditions and wea- 
kest pre-conditions, can be derived formally, using just one propositional tauto- 
logy (1). 

Many ideas that we have presented are well known, when considered sepa- 
rately, and we do not claim any originality in this respect. In particular, we 
were influenced by [20] (the simple inference rule for P\\Q), [11] (the data flow 
semantics of synchronous networks) and [12] (synchronous observers). 

However, we would like to emphasize the following novel treatment of the old 
ideas. First, our approach is much simpler compared with [1,2,3], for example. 

Second, the explicit distinction between the strong and weak observer allows
our calculus to be simplified considerably, and may suggest that this distinction
plays an important role. We derived the strong and weak observer purely
syntactically, only later realizing that analogous concepts already appear,
although somewhat hidden, in [12].

Third, the simple construction of the weakest pre-condition (i.e., $\mathcal{P} \to \beta$) as
parallel composition of circuits is in contrast with the usual, and quite involved,
construction on the level of automata (by mimicking $\neg\mathcal{P} \vee \beta$) [12,5] or logical
formulae [4]. This may suggest that the very compact representation of reactive
systems as circuits, largely neglected in favor of automata, temporal logic
formulae and BDDs, may actually be quite useful in verification.

We have analyzed one example only, just to substantiate our claim that
incremental hardware optimization can actually lead to substantial simplification
of the verification process. To show any practical usefulness of the presented
approach, much more serious verification experiments must be conducted, and we
consider this as possible future work.



Acknowledgments 

This research was supported by the SACRES project [18]. We are also grateful
to Jorge Cuellar and Klaus Winkelmann (Siemens AG, Munich) for fruitful
discussions. 



References 

1. M. Abadi, L. Lamport, Composing specifications, ACM Transactions on Program- 
ming Languages and Systems, 15(1):73-132, Jan. 1993. 

2. M. Abadi, L. Lamport, Conjoining specifications, ACM Transactions on Program- 
ming Languages and Systems, 17(3):507-534, May 1995. 

3. M. Abadi, G.D. Plotkin, A logical view of composition, Theoretical Computer
Science, 114(1):3-30, 1993.

4. H.R. Andersen, Partial Model Checking, Proceedings of LICS’95, IEEE Computer 
Society Press, 398-407, June 1995. 







5. A. Aziz, F. Balarin, R. Brayton and A. Sangiovanni-Vincentelli, Sequential syn- 
thesis using SIS, Int. Conf. on Computer-Aided Design ICCAD’95, 1995. 

6. G. Benveniste and P. Le Guernic, Synchronous programming with events and re- 
lations: the Signal language and its semantics. Science of Computer Programming, 
16:103-149, 1991. 

7. G. Berry and G. Gonthier, The synchronous programming language Esterel: design, 
semantics, implementation. Science of Computer Programming, 19:87-152, 1992. 

8. G. Berry, A hardware implementation of pure Esterel, Sadhana, Academy
Proceedings in Engineering Sciences, Indian Academy of Sciences, 17:95-130, 1992.

9. N. Halbwachs, Synchronous Programming of Reactive Systems, Kluwer Academic 
Publishers, Dordrecht, 1993. 

10. N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud, The synchronous data flow 
programming language Lustre, Proceedings of the IEEE, 79(9):1305-1321, Sep. 
1991. 

11. N. Halbwachs, J.-C. Fernandez and A. Bouajjanni, An executable temporal logic 
to express safety properties and its connection with the language Lustre, Sixth Int. 
Symp. on Lucid and Intensional Programming, ISLIP’93, Quebec, April 1993. 

12. N. Halbwachs, F. Lagnier and P. Raymond, Synchronous observers and the
verification of reactive systems, Third Int. Conf. on Algebraic Methodology and
Software Technology, AMAST’93, Workshops in Computing, Springer, Twente, June 1993.

13. L. Lamport, The temporal logic of actions, ACM Transactions on Programming 
Languages and Systems, 16(3):872-923, May 1994. 

14. F. Maraninchi, Operational and compositional semantics of synchronous
automaton compositions, CONCUR’92, LNCS 630, Springer-Verlag, 550-564, Aug. 1992.

15. A. Poigne and L. Holenderski, On the Combination of Synchronous Languages, Int.
Symp. on Compositionality, COMPOS’97, LNCS 1536, State-of-the-Art Survey, 
Springer, 490-514, Sep. 1997. 

16. E. Sentovich, H. Toma, and G. Berry, Efficient Latch Optimization Using Incom- 
patible Sets, Int. Digital Automation Conf. DAC’97, Anaheim, 1997. 

17. H. Toma, E. Sentovich, and G. Berry, Latch Optimization in Circuits Generated 
from High-Level Descriptions, Int. Conf. on Computer-Aided Design ICCAD’96, 
1996. 

18. SACRES (Safety Critical Real-Time Embedded Systems), Esprit Project 20897, 
http://www.tni.fr/sacres.

19. SIS: a system for sequential circuit synthesis. Tech. Rep. UCB/ERL M92/41, 1-45, 
May 1992, ftp://ic.eecs.berkeley.edu/pub/Sis. 

20. C. Stirling, A complete compositional modal proof system for a subset of CCS, 
LNCS 194, Springer-Verlag, 475-486, 1985.

21. T. Shiple, G. Berry, H. Touati, Constructive analysis of cyclic circuits, Int. Design 
and Testing Conf. IDTC’96, Paris, France, 1996. 




Modelling Coordinated Atomic Actions in 
Timed CSP 



Simeon Veloudis¹ and Nimal Nissanke²

¹ Department of Computer Science, The University of Reading, Whiteknights, P.O.
Box 225, Reading RG6 6AY, UK,
s.veloudis@reading.ac.uk

² School of Computing, Information Systems and Mathematics, South Bank
University, 103 Borough Road, London SE1 0AA, UK,
nissanke@sbu.ac.uk



Abstract. This paper proposes a formal framework for modelling the 
interaction of concurrent items of equipment in real-time safety-critical 
systems and reasoning about their behaviour abstractly. The framework 
is based on the concept of Coordinated Atomic (CA) actions, an ap- 
proach widely used for structuring complex activities in fault-tolerant 
computer systems. It advocates a hierarchical approach and begins with 
the construction of a mathematical model of the behaviour of an indivi- 
dual item of equipment. Later on, the model is extended to incorporate 
the concept of a CA action. In the final stage, a formal representation of 
the ideal behaviour of an abstract CA action is provided. The framework 
uses Timed CSP - a well-established formalism used for representation 
and reasoning in real-time systems. 

Keywords: CA actions, Timed CSP, safety-critical systems, real-time
systems



1 Introduction 

This paper proposes a mathematical framework for modelling the interaction 
of items of equipment in safety-critical systems. The framework is based on 
an approach originally presented in [5] using CSP and extended later in [11] 
to real-time systems using Timed CSP. The latter work proposes a hierarchy of 
abstractions of individual items of equipment, embodying their ideal and failure- 
prone behaviours both with and without sensors at equipment sites. Sensors have 
similar abstractions to the items of equipment. This paper summarises the work 
presented in [10] which further extends [11] through the incorporation of the 
concept of Coordinated Atomic (CA) actions. CA actions [7,14] constitute a 
unified approach to both structuring complex concurrent activities and suppor- 
ting fault tolerance in a system of interacting processes. CA actions provide a 
conceptual framework for dealing with different kinds of concurrency (e.g. coope- 
rative, competitive) and achieving fault-tolerance by extending and integrating 
two complementary concepts: conversations [6] and transactions [2]. They are 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 228-239, 2000. 

© Springer-Verlag Berlin Heidelberg 2000







thus well-equipped for managing the complexity inherent in safety-critical sy- 
stems [16,17]. Although CA actions are aimed at supporting fault-tolerance, 
our concern in this paper is with the construction of a mathematical framework 
for the representation of CA actions that comprise ideally behaving participants. 
This is undertaken with two objectives in mind. Firstly, such a framework allows 
a better understanding of a fundamental and highly complex software enginee- 
ring concept, namely, that of a CA action. Secondly, although not sufficient for 
dealing with failures fully by itself, the work reported here forms the basis of a 
more general model [12] incorporating mechanisms for supporting safety. 

The paper is structured as follows. Section 2 outlines the structure of CA 
actions. Section 3 presents a behavioural model of an item of equipment. Sec- 
tion 4 shows how CA actions can be incorporated into the model described in 
Section 3. In Section 5, a formal model for the representation of CA actions 
is constructed. Finally, Section 6 concludes the paper. The paper assumes the 
reader’s familiarity with Timed CSP [1,9]. 

2 An Outline of CA Actions 



This section outlines certain characteristics of CA actions essential for this paper. 
A CA action consists of a number of concurrently operating processes, referred 
to here as participants, grouped together in order to perform some coordinated 
activity. Within the scope of the action, the participants may freely interact 
with one another but must proceed independently of any other components in 
the system. With respect to other processes in the system, the sequence of ac- 
tions performed within a given CA action is considered atomic. A CA action 
commences its operation when all its participants have been ‘activated’ {entry 
synchronisation), and commits (ends) when each and every participant has com- 
pleted its operation {exit synchronisation). The CA action concept is intended 
to be recursive. That is, a complex CA action may consist of a number of nested 
CA actions enabling a better understanding of its structure. Once a nested ac- 
tion commits, all its participants continue their operations within the enclosing 
CA action. An action may commit only if all of its nested actions have already 
committed. 
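
As an informal illustration only (it is not part of the formal model developed in this paper), the entry synchronisation, exit synchronisation and nesting rule described above can be sketched in Python. The class name `CAAction` and all of its fields and methods are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class CAAction:
    """Illustrative CA action; every name here is hypothetical."""
    name: str
    participants: frozenset
    nested: list = field(default_factory=list)    # directly nested CA actions
    activated: set = field(default_factory=set)   # participants that have entered
    completed: set = field(default_factory=set)   # participants that have finished
    committed: bool = False

    def activate(self, p):
        if p not in self.participants:
            raise ValueError(f"{p} does not participate in {self.name}")
        self.activated.add(p)

    def started(self):
        # Entry synchronisation: the action starts once all participants are in.
        return self.activated == self.participants

    def complete(self, p):
        if p not in self.participants:
            raise ValueError(f"{p} does not participate in {self.name}")
        if not self.started():
            raise RuntimeError(f"{self.name} has not started yet")
        self.completed.add(p)

    def try_commit(self):
        # Exit synchronisation, plus the rule that an action may commit
        # only if all of its nested actions have already committed.
        everyone_done = self.completed == self.participants
        nested_done = all(n.committed for n in self.nested)
        if everyone_done and nested_done:
            self.committed = True
        return self.committed
```

In this sketch an enclosing action whose nested action is still running refuses to commit even when all of its own participants have completed, which mirrors the last rule stated above.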

We consider a safety-critical system to consist of a controlling system, a 
controlled system, and an operator who interacts with the latter via the former. 
Each request made by the operator to the controlling system for a certain service 
is represented as a top-level CA action involving all items of equipment of the 
controlled system relevant to the delivery of the requested service. This top- 
level action persists statically throughout the system’s operation but develops 
(decomposes) into nested actions dynamically as it progresses. As a result, each 
item of equipment always operates within a CA action context changing over 
time. 







3 Items of Equipment 

In modelling items of equipment, let Id denote a deterministic Timed CSP term 
that describes the ideal behaviour of an arbitrary item. Let it be specified as 

\[ Id \;\mathbf{sat}_{\rho}\; Spec(s, \aleph) \tag{1} \]

Spec being a predicate describing all executions of an item that are deemed to be 
ideal. A detailed specification of Spec is unnecessary at this level of abstraction. 
However, the behaviour of an item can in general be failure-prone. This can be 
modelled with reference to a set B of special events called failure events, which 
are essentially synchronisations between an item and its fault environment [4], 
modelled by a term $\mathcal{R}$. Such an environment is not the result of a deliberate act
of design but of our inability to eliminate faults. Hence, a detailed specification
of the behaviour of $\mathcal{R}$ is a difficult, if not an impossible, task. As one would
expect, failure events are outside the events that an ideal item would normally
engage in. Thus, $\sigma(Id) \cap B = \emptyset$. As shown in (2), at any point during its operation
a failure-prone item $P$ may fail non-deterministically by engaging in an event
$f \in B$, with its subsequent behaviour being described by the term $F(f)$.

\[ P = Id \underset{f \in B}{\triangledown} F(f) \tag{2} \]

From its semantic definition, P never refuses a failure event at least up until the 
first occurrence of such an event. The behaviour of a failure-prone item Item 
can be modelled as shown in the definition below. 

Definition 1. $Item = \bigl( (Id \underset{f \in B}{\triangledown} F(f)) \underset{B}{\parallel} \mathcal{R} \bigr) \setminus B$

The failure events are concealed as they are not externally visible and, hence, do 
not require the environment’s cooperation. Furthermore, an item of equipment 
operates ideally exactly when Item = Id. 



4 Incorporation of CA Actions 

This section extends the model outlined in Section 3 to incorporate CA actions; 
full details can be found in [10]. 

Systemic Events. The dynamically changing CA action context of operation 
of an item is captured through a special set $\Sigma_{sys}$ of events called systemic events.
These events are synchronisations between an item and the controlling system
and are intended purely for altering the CA action context of an item. $\Sigma_{sys}$ is
partitioned into two sets of events: $\acute{\Sigma}$ (events that initiate CA actions) and $\grave{\Sigma}$
(events that terminate CA actions). In addition, the following two functions are
introduced: $sma : \Sigma_{sys} \twoheadrightarrow Act$, a total surjection which associates each systemic
event with the unique action it either initiates or terminates ($Act$ being the set of






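all action identifiers); and $bme : \acute{\Sigma} \to \grave{\Sigma}$, a total bijection which associates each
initiating event $e \in \acute{\Sigma}$ with the corresponding terminating event $\grave{e} \in \grave{\Sigma}$. Formally,
$bme(e) = \grave{e} \Rightarrow sma(e) = sma(\grave{e})$.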

We define the nesting structure of an item as a temporally ordered set of 
pairs of the form $(t, m) \in (\mathbb{R}^{+} \times Act)$, where $m$ identifies an action which has
been initiated at time t but is yet to end. In order to determine the operation 
of an item subsequent to the observation of a systemic event, along with the 
identifier of the newly entered action, the duration of the action may also be 
required. This duration may be determined from the nesting structure of the 
action. The function sco given below is intended for this purpose and determines 
the ‘subsequent context of operation’ of an item. 

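Definition 2. $sco : \mathrm{seq}(\mathbb{R}^{+} \times Act) \rightharpoonup (\mathbb{R}^{+} \times \Sigma_{sys}) \rightharpoonup \mathrm{seq}(\mathbb{R}^{+} \times Act)$

\[
\mathrm{dom}\, sco = \{ u \in \mathrm{seq}(\mathbb{R}^{+} \times Act) \mid \forall k, j \in \mathrm{dom}\, u \,.\; k < j \Rightarrow tstrip(u)[k] < tstrip(u)[j] \} \tag{3}
\]
\[
\mathrm{dom}\, sco(u) \subseteq \{ (t, e) \in (\mathbb{R}^{+} \times \Sigma_{sys}) \mid t > end(u) \}
\]
\[
sco(u)(t, e) =
\begin{cases}
\langle (t, sma(e)) \rangle & \text{if } u = \langle\rangle \wedge e \in \acute{\Sigma} \\
u \frown \langle (t, sma(e)) \rangle & \text{if } e \in \acute{\Sigma} \wedge sma(e) \notin \mathrm{ran}\, u \\
u' & \text{if } u = u' \frown \langle (t', m) \rangle \wedge e \in \grave{\Sigma} \wedge sma(e) = m \text{ for some } u', t' \text{ and } m
\end{cases} \tag{4}
\]

Here $\acute{\Sigma}$ and $\grave{\Sigma}$ denote the sets of systemic events that respectively initiate and
terminate CA actions.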

The first argument of sco represents the item’s nesting structure up until the 
time of occurrence of the (timed) systemic event specified in the function’s se- 
cond argument. The definition of sco preserves the requirement on CA actions 
according to which an action may terminate only if all its nested actions have 
already terminated. [Note that some of the mathematical notation follows [13].
Thus, for any sequence $s$ and $k \in \mathrm{dom}\, s$, $s[k]$ denotes the $k$th element of $s$.
Symbol $\rightharpoonup$ denotes a partial function and $tstrip(u)$ (see [1,9]) is a function application
retaining only the timing information of a timed trace $u$.]
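
A minimal Python sketch of the behaviour of sco may help to fix ideas. It is an illustration under stated assumptions, not the authors' definition: the nesting structure is a list of `(time, action)` pairs, `sma` is a dictionary, the sets `initiating` and `terminating` stand for the two parts of the partition of systemic events, and `None` marks arguments outside the domain. For the empty structure any time is accepted, which is an assumption of this sketch.

```python
def sco(u, t, e, sma, initiating, terminating):
    """Subsequent context of operation (illustrative sketch).

    u           -- nesting structure: list of (time, action), oldest first
    (t, e)      -- timed systemic event
    sma         -- dict mapping each systemic event to its action
    Returns the new nesting structure, or None where sco is undefined.
    """
    # dom sco(u) requires the event to occur strictly after the end of u.
    if u and t <= u[-1][0]:
        return None
    m = sma[e]
    active = [a for (_, a) in u]
    if e in initiating:
        # A new action is entered; it must not already be active.
        if m in active:
            return None
        return u + [(t, m)]
    if e in terminating and u and u[-1][1] == m:
        # Only the innermost (most recently entered) action may terminate.
        return u[:-1]
    return None
```

The last case is what preserves the nesting requirement: an attempt to terminate an action while one of its nested actions is still active falls outside the domain of the function.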

The Operational Environment. Occurrences of systemic events may be
attributed to the interaction between an item of equipment and a term O
representing the operational environment; the former engages in a systemic event exactly
when the latter engages in the same event.

Definition 3. The interaction of an item of equipment and the operational
environment is given by the term $Item \underset{\Sigma_{sys}}{\parallel} O$.

O models the operational part of the item’s environment and is thus an abstrac- 
tion of the controlling system. The purpose of O is solely to organise the items 
of equipment into CA actions. In other words, it does not engage in any event 
which directly affects the state of any item of equipment. The main focus of this 
work is on the behaviour of the controlled system and its equipment. Therefore, 
we assume that O never fails. 







The executions of O are subject to a number of constraints, some of which are 
formulated below and others can be found in [10]. The first constraint requires 
that initially, at time 0, some systemic event is offered. Such an event, if accepted, 
initiates a top-level action within which the item operates. 

\[ O \;\mathbf{sat}_{\rho}\; \exists e \in \acute{\Sigma} \,.\; e \in \sigma(s \uparrow 0) \vee (0, e) \notin \aleph \tag{5} \]

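The second constraint places an upper bound of $\Delta$ units of time, $\Delta$ being a
strictly positive constant, on the rate at which systemic events may be observed.

\[
O \;\mathbf{sat}_{\rho}\; \forall t \in \mathbb{R}^{+}, e \in \Sigma_{sys} \,.\; e \in \sigma(s \uparrow t) \Rightarrow
\forall t' \in (t, t + \Delta], e' \in \Sigma_{sys} \,.\; e' \notin \sigma(s \uparrow t') \tag{6}
\]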

It is an assumption in our framework that an ideally behaving item must be 
prepared to participate in any systemic event offered by O. In other words, each 
and every systemic event observed depends solely on the particular behaviour 
shown by O in response to a unique service request made by the underlying 
application. By restricting the timed trace $s$ of O’s behaviour to $\Sigma_{sys}$, we obtain a
temporally ordered set of systemic events that occur during the ideal operation of
an item. This set of events may be determined a priori to match the requirements
of the requested task. In addition, there is a one-to-one correspondence between
a task request and the top-level action within which an item operates. The 
function T below and constraint (8) summarise this discussion. 

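Definition 4. $T : \Sigma_{sys} \rightharpoonup TT$ where $\mathrm{dom}\, T \subseteq \acute{\Sigma}$ (the initiating systemic events) and

\[
\mathrm{ran}\, T = \{ s \in TT \mid \sigma(s) \subseteq \Sigma_{sys} \wedge begin(s) > 0 \wedge
\forall t \geq 0 \,.\; s \uparrow t = \langle\rangle \vee \#(s \uparrow t) = \#(s \uparrow [t, t + \Delta)) = 1 \} \tag{7}
\]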

where TT denotes the set of timed traces. Given the observation of a syste- 
mic event that initiates a top-level action, the function T returns a temporally 
ordered sequence of systemic events. These events are observed in the ideal be- 
haviour of the item concerned within the given top-level action as it performs 
the required task. The sequence $tstrip(T_{e_0})$ contains all such observation times.
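
The condition placed on the range of $T$ can be read as a check on a finite timed trace. The Python sketch below is illustrative only: the function name `in_range_of_T` is hypothetical, a timed trace is assumed to be a list of `(time, event)` pairs, and the reading of (7) used here is that every event is systemic, occurs strictly after time 0, and is the only event in the half-open window of length `delta` starting at its own occurrence time.

```python
def in_range_of_T(s, systemic, delta):
    """Check the range condition of T on a finite timed trace (sketch).

    s        -- list of (time, event) pairs
    systemic -- set of systemic events
    delta    -- strictly positive separation constant
    """
    times = [t for (t, _) in s]
    # Only systemic events may appear in the trace.
    if any(e not in systemic for (_, e) in s):
        return False
    # begin(s) > 0: no event at time 0 (vacuously true for the empty trace).
    if times and min(times) <= 0:
        return False
    # Whenever an event occurs at t, it is the only one in [t, t + delta).
    for t in times:
        in_window = sum(1 for x in times if t <= x < t + delta)
        if in_window != 1:
            return False
    return True
```

For instance, with `delta = 2.0` the trace `[(1.0, "a"), (3.0, "b")]` passes the check, whereas `[(1.0, "a"), (2.5, "b")]` does not, because the second event falls inside the window opened by the first.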

O satp Asm =k (V e G Egys • e G a{s t 0) T{e) = (s]0)[i?sys) (8) 



where 

Asm = V t G M"*", e G Egys • 

e G cr(s t 0 ^ ([^ + oo) n [0, begin{s [E^ys t [^ + oo))) X Esys) 'T 

The predicate Asm above requires that, for any initially observed $e \in \Sigma_{sys}$,
O may perform the systemic events determined by $T(e)$, if its environment offers
to participate in each such event at least up until the time of occurrence of the
event. As will be seen, this amounts to an ideally behaving underlying item.
Symbol ^ denotes a strictly positive constant such that ^ < A. Its significance 
will become apparent once Definition 5 is introduced. 







CA Action-Dependent Operation. Operation of an item following the
observation of a systemic event depends both on the newly entered action and the
duration of the action up until the observation time of the event. In specifying 
the item’s behaviour, it is therefore important that both these two factors are 
taken into account, thus requiring a redefinition of (1). For this, consider the 
observation of a systemic event at time $t$ and let $m$ and $t_0 \leq t$ be respectively
the identifier and initiation time of the action entered at $t$. Then

to) satp SpeCm,t-to(s,i^) (9) 

where — to) denotes the Timed CSP term that describes the ideal 

operation of the item from $t$ onwards, while the predicate $Spec_{m,\,t-t_0}$ describes
all possible ideal behaviours of the item within the action $m$ but after $(t - t_0)$ time
units into $m$’s duration. A detailed specification of $Spec_{m,\,t-t_0}$ is unnecessary at
the current level of abstraction.

The following mutually recursive definition describes the (ideal) operation of 
a given item within the context of an evolving CA action. 

Definition 5. Id = eo : Ssys — >■ Reo where, 

Reo = {Xk{uk, eo) = Qk{uk, eo))o k G (domTeJ U {0} (10) 

such that Mo = ((0, eo)), Uk G dom sco and, for k G dom7)jg U {0}, 

Qk{uk,eo) = WAIT{^)] {Id°‘‘^\last{uk),tstrip{Teo)[k] - end{uk)) 

V 

e e H ) 

Xk+i{sco{uk){tstrip{Teo)[k + 1], e), eo) 

4ie yf bme{eo) SKIP) 



where $T_{e_0}$ is an abbreviation for $T(e_0)$.

Initially, the item engages in a systemic event $e_0$ that commences a top-level
action. Symbol $S_{sys}$ denotes the set of all such events; trivially, $S_{sys} \subseteq \acute{\Sigma}$, the set
of initiating systemic events. Subsequently, the item continues its operation within
this top-level action as described by term $R_{e_0}$. As in Definition 2, for any $k$, the
sequence $u_k$ gives the nesting structure of the item at time $tstrip(T_{e_0})[k]$. Upon the observation of a systemic
event, as long as this does not terminate the top-level action, as seen in its ex- 
ternally visible behaviour the item pauses for a (small) strictly positive length 
of time ^ and then resumes operation within the newly emerged CA action. The 
^-long pause is intended for the item to adjust to operating according to the 
requirements of the new action. In addition, the ^-long delay has a semantic sig- 
nificance since it makes each recursive call time-guarded and, hence, guarantees 
a well-defined semantics for Id. Upon the observation of a systemic event which 
ends the operation of the top-level action, the item terminates successfully (by 
behaving like SKIP). This reflects an assumption in our model (stated in Section
2) according to which an item may operate within the context of only one
CA action.







5 Ideal CA Actions 

This section develops a formal framework for the representation of CA actions 
which comprise ideally behaving participants. 

Reconfiguration Events. Reconfiguration events are a special set of events
$\Sigma_r$ intended for capturing the dynamically evolving nesting structure of a given
CA action. They too have no direct effect on the state of any item. The
observation of a reconfiguration event triggers the initiation and/or the termination
of one or more CA actions. However, as explained earlier, a given CA action
$m \in Act$ starts/ends its operation only when at least two items (its
participants) synchronise with O on the same systemic event $e$ such that $sma(e) = m$.
Note that $\Sigma_r$ (reconfiguration events) and $\Sigma_{sys}$ (systemic events) are mutually
exclusive, because the former is intended for altering the nesting structure of CA
actions, while the latter is intended for changing the CA action context of items of
equipment. Clearly, there exists a correspondence between these two sets of events;
this is captured by the injective function $rms : \Sigma_r \rightarrowtail \mathbb{F}\,\Sigma_{sys}$. For any $e_r \in \Sigma_r$,
$rms(e_r)$ returns a non-empty set of systemic events, each one of which is to be
performed collectively by a number of items in order to implement the
reconfiguration intended by $e_r$. As indicated by the function type of $rms$, each event
from $\Sigma_r$ corresponds to a unique reconfiguration of the nesting structure of an
action. As will be seen in the constraint (19), the observation of an $e_r \in \Sigma_r$
coincides temporally with the offer of each of the systemic events in $rms(e_r)$.
The need for both systemic and reconfiguration events is justified in [10].



5.1 Structural Information About CA Actions 

Formalisation of the operation of an action requires several mathematical struc- 
tures for recording information concerning its participants and the nesting struc- 
ture of the action. 



Action Participants. Any action comprises at least two participants, each $i$th
one of which must contain in its interface $\Sigma^{i}_{sys}$ with O the systemic events which
initiate/terminate the action¹.

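Definition 6. $part : Act \rightharpoonup \mathbb{F}\, Eq$ such that, for $m \in Act$,

\[
\text{(i)}\;\; sma^{-1}(m) \subseteq \bigcap_{i \in part(m)} \Sigma^{i}_{sys}
\qquad \text{and} \qquad
\text{(ii)}\;\; \# part(m) \geq 2
\]

$Eq$ being an index set.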
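
The two conditions of Definition 6 can be checked mechanically on finite data. The following Python sketch is only an illustration: `check_part` is a hypothetical helper, `part` and `sma` are assumed to be dictionaries, and `interface[i]` is assumed to hold the systemic events of the item with index `i`.

```python
def check_part(part, sma, interface):
    """Return the actions that violate the conditions on part (sketch).

    part      -- dict: action -> set of participant indices
    sma       -- dict: systemic event -> action
    interface -- dict: participant index -> set of its systemic events
    """
    violations = []
    for m, members in part.items():
        # Inverse image of m under sma: events that initiate or terminate m.
        events_of_m = {e for e, a in sma.items() if a == m}
        # Systemic events shared by every participant of m.
        if members:
            shared = set.intersection(*(set(interface[i]) for i in members))
        else:
            shared = set()
        # (i) every such event is in each participant's interface,
        # (ii) the action has at least two participants.
        if len(members) < 2 or not events_of_m <= shared:
            violations.append(m)
    return violations
```

An empty result means that, for the data supplied, every action has at least two participants and each of them can engage in the events that start and end the action.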

^ Since this section considers the operation of several concurrently operating items of 
equipment, in order to distinguish between different items the sets E, E, Egyg and 
A are indexed by the elements of an index set Eq. Obviously, E‘‘s partitions A; the 
same applies to other sets. 







Nesting Structure. The nesting structure of an action at a given time t can be 
determined from the set of reconfiguration events in which the action has engaged 
up to t. Let u denote a temporally ordered set of such timed reconfiguration 
events. The set of nested actions active at t may be derived by considering 
elements $(t', e_r)$ in $u$ such that: a) $e_r$ gives rise to at least one CA action start
systemic event $e_s$, and b) there is no element $(t'', e'_r)$ in $u$ such that $t'' > t'$ and $e'_r$
gives rise to the event $bme(e_s)$. In other words, our interest is in reconfiguration
events which have initiated at least one nested action which is yet to terminate. 

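Definition 7. For $m \in Act$, $\theta^{m} : \mathrm{seq}(\mathbb{R}^{+} \times \Sigma_r) \rightharpoonup \mathbb{P}(\mathbb{R}^{+} \times Act)$, where

\[
\mathrm{dom}\, \theta^{m} = \{ u \in \mathrm{seq}(\mathbb{R}^{+} \times \Sigma_r) \mid
\forall i, j \in \mathrm{dom}\, u \,.\; i < j \Rightarrow tstrip(u)[i] < tstrip(u)[j] \} \tag{12}
\]
\[
\begin{aligned}
\theta^{m}(u) = \{ (t, m) \in (\mathbb{R}^{+} \times Act) \mid\;
& \exists e_r \in \Sigma_r, e_s \in \acute{\Sigma} \,.\; \langle (t, e_r) \rangle \;\mathrm{in}\; u \wedge e_s \in rms(e_r) \wedge m = sma(e_s) \;\wedge \\
& \forall (t', e'_r) \in (\mathbb{R}^{+} \times \Sigma_r) \,.\; \langle (t', e'_r) \rangle \;\mathrm{in}\; u \wedge t' > t \Rightarrow bme(e_s) \notin rms(e'_r) \}
\end{aligned} \tag{13}
\]

where $\acute{\Sigma}$ is the set of systemic events that initiate CA actions.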

Note that in between any two consecutive reconfiguration event occurrences, m 
undergoes no change in its nesting structure. Furthermore, let $\kappa^{m}$ denote the
following function: 

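Definition 8. For $m \in Act$, $\kappa^{m} : \mathrm{seq}(\mathbb{R}^{+} \times \Sigma_r) \rightharpoonup \mathbb{F}\, Eq$, where
$\mathrm{dom}\, \kappa^{m} = \mathrm{dom}\, \theta^{m}$ and

\[
\kappa^{m}(u) = \{ i \in Eq \mid i \notin \bigcup_{a \in \theta^{m}(u)} part(\sigma(\langle a \rangle)) \} \tag{14}
\]

The set $\kappa^{m}(u)$ comprises participants of $m$ which are not engaged in any nested
action.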
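
Definitions 7 and 8 can be illustrated together on finite data. The Python sketch below is a reading of the two definitions under stated assumptions, not the authors' formalisation: `theta` and `kappa` are hypothetical function names, `u` is a list of `(time, reconfiguration event)` pairs, and `rms`, `sma`, `bme` and `part` are dictionaries standing for the functions of the same names.

```python
def theta(u, rms, sma, bme, initiating):
    """Nested actions still active after the reconfiguration events in u."""
    active = set()
    for (t, er) in u:
        for es in rms[er]:
            if es not in initiating:
                continue
            # The action started by es is still active unless a later
            # reconfiguration event gives rise to its terminating event.
            ended = any(t2 > t and bme[es] in rms[er2] for (t2, er2) in u)
            if not ended:
                active.add((t, sma[es]))
    return active


def kappa(u, index_set, part, rms, sma, bme, initiating):
    """Indices of items not engaged in any active nested action."""
    busy = set()
    for (_, action) in theta(u, rms, sma, bme, initiating):
        busy |= set(part[action])
    return set(index_set) - busy
```

With three reconfiguration events that start an action B, end it, and then start an action C, only C remains in the result of `theta`, and `kappa` returns the indices that do not participate in C.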

Interrupting Reconfiguration Events. Not all reconfiguration events may
interrupt the operation of a given action. For any $m \in Act$, $m$ is affected by the
occurrence of a reconfiguration event $e_r$ at time $t$ if and only if $rms(e_r)$ contains
at least one systemic event $e_s$ such that: a) if $e_s \in \acute{\Sigma}$ (an initiating event) then $e_s$
must belong to the alphabets of at least two of the participants of $m$ operating at
time $t$ within $m$ but not within any of $m$’s nested actions, and b) if $e_s \in \grave{\Sigma}$ (a
terminating event) then $e_s$ must belong to the alphabet of each and every participant
of an action $m'$ nested immediately below $m$ but not below any of $m$’s nested
actions. Formally,

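Definition 9. For $m \in Act$, $\delta^{m} : \mathrm{seq}(\mathbb{R}^{+} \times \Sigma_r) \rightharpoonup \mathbb{F}\,\Sigma_r$, where
$\mathrm{dom}\, \delta^{m} = \mathrm{dom}\, \theta^{m}$ and

\[
\delta^{m}(u) = \{ e_r \in \Sigma_r \mid \exists e_s \in rms(e_r) \,.\;
(e_s \in \acute{\Sigma} \Rightarrow part(sma(e_s)) \subseteq \kappa^{m}(u)) \wedge
(e_s \in \grave{\Sigma} \Rightarrow sma(e_s) \in \sigma(\theta^{m}(u))) \} \tag{15}
\]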

As will be seen in Definition 12, each sequence $u$ in the domain of $\delta^{m}$ consists
of observed timed reconfiguration events that affect the operation of m. 







5.2 The Operational Environment Revisited 

Occurrences of reconfiguration events may be attributed to the interaction bet- 
ween a CA action and the term O; the former engages in a reconfiguration event 
exactly when the latter engages in the same event. 

Definition 10. For an arbitrary² $e \in \Sigma_r$, $m \in Act$, $t_m \in \mathbb{R}^{+}$, the interaction of
a CA action with O takes the form $CA_{e,m}(t_m) \underset{\Sigma_{sys} \cup \Sigma_r}{\parallel} O$.

Analogous to the case of ideally behaving items of equipment under syste- 
mic events (Section 4), it is an assumption in our framework that a CA action 
which comprises ideally behaving items must be prepared to participate in any 
reconfiguration event offered by O. In other words, the occurrence of every
reconfiguration event in the behaviour of such an action depends only on the particular
behaviour exhibited by O. Analogous to the account given in Section 4 in relation 
to systemic events, for each top-level CA action the occurrences of reconfigura- 
tion events may be determined a priori. Similarly to function T of Definition 4 
and constraint (8), let us introduce below function $T^{r}$ and constraint (18).

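Definition 11. $T^{r} : \Sigma_r \rightharpoonup TT$, a partial injection, where

\[
\mathrm{dom}\, T^{r} = \{ e_r \in \Sigma_r \mid \# rms(e_r) = 1 \wedge rms(e_r) \subseteq \bigcap_{i \in Eq} \Sigma^{i}_{sys} \} \tag{16}
\]
\[
\mathrm{ran}\, T^{r} = \{ s \in TT \mid \sigma(s) \subseteq \Sigma_r \wedge begin(s) > 0 \wedge
\forall t \geq 0 \,.\; s \uparrow t = \langle\rangle \vee \#(s \uparrow t) = \#(s \uparrow [t, t + \Delta')) = 1 \} \tag{17}
\]

where $\Delta'$ is a strictly positive constant such that $\Delta' > \max\{ \Delta_i \mid i \in Eq \}$.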

It is to be noted that a reconfiguration event initiates a top-level action only if 
it gives rise to a unique systemic event which may be performed by all items of 
equipment; this justifies the definition (16) of the domain of $T^{r}$.

O satp R"*" X Er c ^ (y Cr G Er • Cr G a{s t 0) =k T^{Cr) = slO) [Er) (18) 

The antecedent M’*' x C H of the implication above corresponds to the as- 
sumption that O’s environment is always prepared to engage in any offer of a 
reconfiguration event. Furthermore, the constraint below states that any
observation of $e_r \in \Sigma_r$ temporally coincides with the offer of each and every event
from $rms(e_r)$.

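\[
O \;\mathbf{sat}_{\rho}\; \forall e_r \in \Sigma_r, t \in \mathbb{R}^{+} \,.\; e_r \in \sigma(s \uparrow t) \Rightarrow
\forall e_s \in rms(e_r) \,.\; e_s \in \sigma(s \uparrow t) \vee e_s \notin \sigma(\aleph \uparrow t) \tag{19}
\]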

A number of additional constraints on O’s operation may be found in [10]. 

² As will be seen shortly, $e$ is not really arbitrary, but must be within the domain of $T^{r}$.







5.3 A Representation of CA Actions 

Let us now define the Timed CSP term that describes the operation of a CA 
action under an arbitrary execution of the term O. 

Definition 12. Let CAtl denote a top-level action. Let us define it as 

CAtl = Sr : doxnr ^CA„a(eo){^) (20) 

where event $e_0$ is the element in the singleton $rms(e_r)$. For arbitrary $m \in Act$,
$t_m \in \mathbb{R}^{+}$ such that $t_m$ is the initiation time of $m$,



CAe,,m(,tm) = {X^{er,tm,Uk) = {Sr, tm, Uk))o k G {d0U\Tl]tm) U {0} 
where mq = () and for k G dom7e(1 tm U {0}, 



5 tm 5 ) — 

I I I CAg^^^(t)) {^CTtdi^Uk) t) 



/j<n 



Itenia(^j)]{end{uk)) \ Aj 



V (21) 

6^5’" (life )U { (S<lsma) “ 1 (m)} 

(Xk+i(er, tm, Uk ^ {{time{er){e){tm){uk), e))) <l sma)~^{m) ^ rms(e) :)> 

SKIP) 



where $a \in (1..n \rightarrowtail \kappa^{m}(u_k))$, $n = \#(\kappa^{m}(u_k))$.

Note that each execution of O corresponds to a unique top-level action designed 
to deliver the service requested by the operator of the underlying application. 
The function time in the definition above returns the time of occurrence of 
a reconfiguration event which actually affects the operation of the action. Its 
definition is given in [10]. Symbol $a$ denotes an arbitrary sequence of elements
from $\kappa^{m}(u_k)$. This is required simply for the application of the indexed interface
parallel operator (see [10]). We note that both $\acute{\Sigma} \triangleleft sma$ and $\grave{\Sigma} \triangleleft sma$ are bijections
(symbol $\triangleleft$ denotes domain restriction; see [13] for the notation). The use of the
event interrupt operator in each mutually recursive equation ensures that a CA
action always accepts any offers of reconfiguration events from the term O and,
hence, constitutes a suitable environment for that term. The concealment of
events from $A_j$ indicates that such events require only the cooperation of the
participants of the action, thus maintaining the atomicity of the action. Finally,
it is to be noted that the mutual recursion of Definition 12 does not have a
well-defined semantics as none of its recursive calls are time-guarded. However,
by placing $CA_{e,m}(t_m)$ in its intended environment, namely with O, we end up
in a well-timed term³.

³ This is ensured by the constraint (18).







6 Related Work and Conclusions 

This paper presents a novel mathematical framework for the representation of 
CA actions for use in real-time safety-critical systems. As evident from [8,15, 
16,17], it is not the first time that the concept of CA action has been used in the 
development of safety-critical systems. However, with the exception of formalisa- 
tion of properties of CA actions in the form of preconditions and postconditions 
using temporal logic, formal studies on behavioural aspects of CA actions are 
limited. Consequently, rigorous reasoning about the adherence by a given action 
to these properties has not been possible. In [3], a CSP-based formalism, namely 
the ERT model of behaviour, is employed to model CA actions. However, such an 
abstraction is inherently untimed and hence inadequate for real-time systems. 

Despite the omission of failure-handling mechanisms due to limitations of 
space, the framework proposed in this paper constitutes one of the first steps 
towards understanding the complex issues encountered in the formalisation of 
CA actions within a context relevant to real-time safety-critical systems. In
addition, such a framework serves as the necessary basis for a more general model,
such as the one outlined in [12], embodying failure-handling mechanisms and
capabilities for formal reasoning about safety. At the equipment level, our approach
advocates two levels of abstraction. At the higher level, a formal model of the 
failure-prone behaviour of an item is presented. At the lower level, the item is 
considered within an environment which allows for the dynamic evolution of its 
CA action context. The CA action model provided is structured on the basis of: 
a) the behaviours of the action’s participants, b) the behaviour of the operatio- 
nal environment O, and c) a number of functions which capture the dynamic 
evolution of the nesting structure of the action. Constraints are imposed to en- 
sure that the requirements of the CA action concept are not violated. Finally, 
the choice of Timed CSP and, in particular, of the Timed Failures Model $TM_F$,
can be justified on the basis that such a formalism is especially suitable for
modelling and reasoning abstractly about externally visible behavioural characteristics
of concurrently executing processes. In addition, it is well equipped to deal with 
the complex timing constraints encountered in the design of real-time systems. 



Acknowledgements 

The authors wish to thank Dr Manoranjan Satpathy for his encouragement and 
the anonymous reviewers for their comments and suggestions. 



References 

1. J. Davies. Specification and Proof in Real-Time CSP. Distinguished Dissertations 
in Computer Science. Cambridge University Press, 1993. 

2. J. Gray and A. Reuter. Transaction Processing: Concepts and Techniques. Morgan 
Kaufmann, 1993. 







3. M. Koutny and G. Pappalardo. The ERT model of fault-tolerant computing and
its application to a formalisation of coordinated atomic actions. Technical Report 
636, Department of Computing Science, University of Newcastle upon Tyne, 1998. 

4. Z. Liu and M. Joseph. Transformation of programs for fault-tolerance. Formal 
Aspects of Computing, 4(5):442-469, 1992. 

5. N. Nissanke, J. Pascoe, and A. E. Abdallah. CSP in safety critical systems design.
Technical report, University of Reading Department of Computer Science, 1999.

6. B. Randell. System structure for software fault tolerance. IEEE Trans. on Software
Engineering, 1(2):220-232, June 1975.

7. B. Randell, A. Romanovsky, R.J. Stroud, J. Xu, A.F. Zorzo, D. Schwier, and F. von
Henke. Coordinated atomic actions: Formal model, case study and system imple- 
mentation. Technical Report 628, Department of Computing Science, University 
of Newcastle upon Tyne, 1998. 

8. A. Romanovsky, J. Xu, and B. Randell. Exception handling in object-oriented 
real-time distributed systems. Technical Report 624, Department of Computing 
Science, University of Newcastle upon Tyne, 1998. 

9. S. Schneider. Concurrent and Real-time Systems, The CSP Approach. Worldwide 
Series in Computer Science by David Barron and Peter Wegner. John Wiley & 
Sons, 2000. 

10. S. Veloudis and N. Nissanke. A formal framework for modelling interactions within 
a safety-critical system. Technical Report RUCS/2000/TR/002/B, University of 
Reading Department of Computer Science, 2000. 

11. S. Veloudis and N. Nissanke. Modelling abstract items of equipment - a formal 
framework. Technical Report RUCS/2000/TR/001/A (submitted for publication). 
University of Reading Department of Computer Science, 2000. 

12. S. Veloudis and N. Nissanke. Reasoning about safety in critical systems. Technical 
Report (to appear), University of Reading Department of Computer Science, 2000. 

13. J. Woodcock and J. Davies. Using Z - Specification, Refinement and Proof. 
Prentice Hall Series in Computer Science by Tony Hoare and Richard Bird. Prentice 
Hall, 1996. 

14. J. Xu, B. Randell, A. Romanovsky, C.M.F. Rubira, R.J. Stroud, and Z. Wu. Fault 
tolerance in concurrent object-oriented software through coordinated error 
recovery. Technical Report 507, Department of Computing Science, University of 
Newcastle upon Tyne, 1995. 

15. J. Xu, A. Romanovsky, R.J. Stroud, and A.F. Zorzo. Rigorous development of a 
safety-critical system based on coordinated atomic actions. Technical Report 662, 
Department of Computing Science, University of Newcastle upon Tyne, 1999. 

16. A.F. Zorzo. A production cell controlled by dependable multiparty interactions. 
Technical Report 667, Department of Computing Science, University of Newcastle 
upon Tyne, 1999. 

17. A.F. Zorzo, A. Romanovsky, J. Xu, B. Randell, R.J. Stroud, and I.S. Welch. Using 
coordinated atomic actions to design dependable distributed object systems. 
Technical Report 619, Department of Computing Science, University of Newcastle upon 
Tyne, 1998. 




A Logical Characterisation of Event Recording Automata 

Deepak D'Souza 

Chennai Mathematical Institute, 
92 G. N. Chetty Road, Chennai, India. 
deepak@smi.ernet.in 



Abstract. We show that the class of Event Recording Automata [2] admits 
a logical characterisation via an unrestricted monadic second order 
logic interpreted over timed words. We point out the closure properties 
corresponding to existential quantification in the logic. A timed temporal 
logic considered earlier in the literature is shown to be expressively 
complete with respect to our monadic logic. 

The results in this paper extend smoothly to the class of event clock 
automata (also introduced in [2]). 



1 Introduction 

The timed automata of Alur and Dill [1] are a standard model for describing 
timed behaviours. They augment classical automata with clocks which can be 
read and reset while taking transitions. These automata are very powerful in 
language theoretic terms: while their languages are closed under union and 
intersection and their emptiness problem is decidable, their languages are not 
closed under complementation and their language inclusion problem is 
undecidable. Consequently, the verification problem, which is often phrased as 
whether $L(\mathcal{A}_{pr}) \subseteq L(\mathcal{A}_{spec})$, cannot be solved 
for these automata in general. One must 
then either work with deterministic specifications or with a restricted class of 
timed automata which has the required closure properties. 

The event recording automata of Alur, Fix, and Henzinger [2] are a subclass 
of timed automata which are both determinisable and closed under the boolean 
operations of union, intersection and complementation. The key feature of these 
automata is that they have an implicit clock associated with each action. This 
clock is reset with every occurrence of the associated action. This permits the 
modeller to record the time elapsed since the last occurrence of each action. As 
a result common real-time requirements like “consecutive requests are separated 
by a distance of at least 5 time units” can be naturally modelled using these 
automata. 

In this paper we argue in favour of these automata from a logical viewpoint. 
In the classical setting, the existence of a monadic second order logical charac- 
terisation for a class of languages is a strong endorsement of its regularity. Such 
a characterisation can also help in identifying natural temporal logics which can 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 240-251, 2000. 

© Springer-Verlag Berlin Heidelberg 2000 







be expected to have advantages in terms of relatively efficient algorithms for 
solving their verification problem. As is well-known, Linear Time Temporal Logic 
(LTL) is expressively equivalent to the first-order fragment of S1S, Büchi's 
monadic second-order logic which characterises untimed regular languages. LTL is 
natural to use, and has an exponential time algorithm for deciding its 
satisfiability problem as against the non-elementary decision procedure for the 
first-order fragment of S1S. The aim of this paper is to show that event recording 
automata admit a similar logical framework. 

We characterise event recording automata via a monadic second order logic 
interpreted over timed words. This logic, called $\mathrm{MSO}_{er}$, has a timed 
modality of the form $\Delta_a(x) \in I$ which asserts that w.r.t. the position $x$ 
in the timed word, the time elapsed since the last $a$ action lies in the interval 
$I$. The logic is unrestricted and in particular it allows full existential 
quantification over set variables (or monadic predicates). 

We further show that a timed temporal logic proposed earlier in the literature 
is expressively complete with respect to our logic in that it corresponds to the 
first-order fragment of our monadic logic. The logic, called here 
$\mathrm{LTL}_{er}$, has a timed modality of the form $\lhd_a \in I$ which asserts 
that the time elapsed since the last $a$ action lies in the interval $I$. This logic 
has been studied earlier by Raskin in [10] and its satisfiability and model-checking 
problems are solved there. The issue of its expressive completeness was however not 
addressed. 

There have been several logical characterisations of timed automata and their 
subclasses proposed in the literature [13,9,10]. Unfortunately these logics are all 
restricted in their syntax, typically in their use of existential quantification. One 
of the first such characterisations was given by Wilke in [13], where he 
characterises the class of timed automata via the logic 
$\mathcal{L}\overleftrightarrow{d}$, the monadic second order logic of relative 
distance. $\mathcal{L}\overleftrightarrow{d}$ has a restricted syntax due to the fact 
that timed automata are not closed under complementation. In particular, set 
variables used in a distance predicate must be existentially quantified only at the 
beginning of the formula. 

In [9] Raskin, Schobbens, and Henzinger propose the class of recursive event 
clock automata and a corresponding monadic logic called MinMaxML. Once 
again the second-order quantification is restricted, as one cannot quantify over 
set variables which are in the scope of a distance operator. The authors also 
propose a timed temporal logic called EventClockTL that is expressively complete 
w.r.t. MinMaxML. These results are shown in the setting of the so-called 
“continuous” time semantics [10], which turns out to be distinguishable from the 
more classical interpretation used in both [13] and this paper. 

In [6] a subclass of event recording automata called product interval automata 
is studied. These automata admit a logical characterisation which comprises 
boolean combinations of “local” monadic second order logic assertions. A 
corresponding timed temporal logic called $\mathrm{TLTL}^{\otimes}$ is identified 
which is expressively complete w.r.t. this characterisation. The study of these 
automata and their extensions led us in a natural way to the class of event 
recording automata. The techniques used in this paper essentially build on the ones 
used in [6]. 







There is an interesting aspect of our characterisation of event recording 
automata (which we elaborate on towards the end of Section 3). Existential 
quantification in a monadic logic usually corresponds to the associated class of 
languages being closed under the operation of projection, or renaming. Event 
recording automata are however not closed under renaming, despite the fact that 
they admit an unrestricted logical characterisation. We explain this phenomenon by 
showing that existential quantification in our logic actually corresponds to closure 
under renaming of a weaker class of event recording automata which we call 
quasi event recording automata. 

Finally, we would like to point out that the results presented in this paper 
extend easily to the class of event clock automata. These automata were also 
introduced in [2] and extend event recording automata with “event-predicting” 
clocks. Some further details on the extension of our results to this class can be 
found in [5]. 

2 Event Recording Automata 

Let $\mathbb{N}$ denote the set of natural numbers $\{0, 1, \ldots\}$. We will use 
$\mathbb{R}^{>0}$ and $\mathbb{R}^{\geq 0}$ to denote the set of positive and 
non-negative reals respectively, and $\mathbb{Q}^{\geq 0}$ to denote the 
non-negative rationals. As usual the set of finite and infinite words over an 
alphabet $A$ will be denoted by $A^*$ and $A^\omega$ respectively. 

A Büchi automaton over an alphabet $A$ is a structure 
$\mathcal{A} = (Q, \longrightarrow, Q_{in}, F)$ where $Q$ is a finite set of states, 
$\longrightarrow \subseteq Q \times A \times Q$ is the transition relation, 
$Q_{in} \subseteq Q$ is a set of initial states, and $F \subseteq Q$ is a set of 
accepting states. 

Let $\alpha \in A^\omega$. A run of $\mathcal{A}$ over $\alpha$ is a map 
$\rho : \mathbb{N} \to Q$ which satisfies: $\rho(0) \in Q_{in}$ and 
$\rho(i) \xrightarrow{\alpha(i)} \rho(i+1)$ for every $i \in \mathbb{N}$. We say 
$\rho$ is an accepting run of $\mathcal{A}$ on $\alpha$ if $\rho(i) \in F$ for 
infinitely many $i \in \mathbb{N}$. The set of words accepted by $\mathcal{A}$, 
denoted here (for reasons which will soon be clear) as $L_{sym}(\mathcal{A})$, is 
defined to be the set of words in $A^\omega$ on which $\mathcal{A}$ has an accepting 
run. We term a subset $L$ of $A^\omega$ $\omega$-regular if 
$L = L_{sym}(\mathcal{A})$ for some Büchi automaton $\mathcal{A}$ over $A$. 
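The acceptance condition can be made concrete for ultimately periodic words. The following is a minimal Python sketch, not part of the original development: it assumes the input word is given as a finite stem followed by a non-empty loop repeated forever, and the function name `accepts_lasso` and the tuple encoding of transitions are illustrative choices. It searches the product of automaton states and loop positions for a reachable accepting state that lies on a cycle.

```python
def accepts_lasso(transitions, initial, accepting, stem, loop):
    """Decide whether a Büchi automaton accepts the word stem . loop^omega.

    transitions: set of triples (q, a, q2); initial, accepting: sets of states;
    stem, loop: lists of letters, with loop non-empty (an assumption of this sketch).
    """
    # States reachable after reading the finite stem.
    current = set(initial)
    for a in stem:
        current = {q2 for (q1, b, q2) in transitions if q1 in current and b == a}

    n = len(loop)

    def successors(node):
        q, k = node
        return {(q2, (k + 1) % n)
                for (q1, b, q2) in transitions if q1 == q and b == loop[k]}

    def reach(starts):
        # Nodes reachable from `starts` in one or more steps.
        seen, stack = set(), list(starts)
        while stack:
            for nxt in successors(stack.pop()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    starts = {(q, 0) for q in current}
    reachable = starts | reach(starts)
    # An accepting run exists iff some reachable accepting node lies on a cycle.
    return any(q in accepting and (q, k) in reach({(q, k)}) for (q, k) in reachable)


# Hypothetical automaton over {a, b} accepting words with infinitely many a's.
T = {(0, "a", 1), (0, "b", 0), (1, "a", 1), (1, "b", 0)}
```

For instance, `accepts_lasso(T, {0}, {1}, ["b"], ["a", "b"])` holds, while the word `a b^omega` is rejected because the accepting state is visited only once.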

In what follows, we will concentrate on infinite timed behaviours. The results 
can be easily extended to cover finite timed behaviours as well. 

An infinite timed word over $A$ is a member $\sigma$ of 
$(A \times \mathbb{R}^{>0})^\omega$ which satisfies the following conditions. Let 
$\sigma = (a_0, t_0)(a_1, t_1) \cdots$. Then: 

1. For each $i \in \mathbb{N}$, $t_i < t_{i+1}$ (strict monotonicity). 

2. For each $t \in \mathbb{R}^{>0}$ there exists $i \in \mathbb{N}$ such that 
$t_i > t$ (progressiveness). 

We use $TA^\omega$ to denote the set of infinite timed words over $A$. 

For an infinite timed word $\sigma$ we will also use the representation of $\sigma$ 
as $(\alpha, \eta)$ where $\alpha \in A^\omega$ and 
$\eta : \mathbb{N} \to \mathbb{R}^{>0}$. 

In what follows, we will use intervals with rational bounds to specify timing 
constraints (and use $\infty$ as the upper bound to capture unbounded intervals). 
These intervals will be of the form $(l,r)$, $[l,r)$, $(l,r]$, or $[l,r]$, where 
$l, r \in \mathbb{Q}^{\geq 0} \cup \{\infty\}$ with $l \leq r$. For an interval of the 
form $(l,r]$ or $[l,r]$ we require $r \neq \infty$, and for intervals of the form 
$[l,r)$ or $[l,r]$ we require $l \neq 0$. Further, to avoid empty 







intervals, unless an interval is of the form $[l,r]$, we require $l < r$. An interval 
will denote a non-empty, convex subset of reals in the obvious way. For example the 
interval $[1,\infty)$ denotes the set $\{t \in \mathbb{R} \mid 1 \leq t\}$. The set 
of all intervals will be denoted by $\mathcal{I}\mathbb{R}$. 
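The well-formedness conditions on intervals can be captured in a small data type. This is a minimal Python sketch under stated assumptions: the class name `Interval`, its field names, and the use of `None` for the bound $\infty$ are illustrative choices, not notation from the paper.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional


@dataclass(frozen=True)
class Interval:
    lo: Fraction
    hi: Optional[Fraction]  # None encodes the upper bound "infinity"
    lo_closed: bool
    hi_closed: bool

    def __post_init__(self):
        if self.lo < 0:
            raise ValueError("bounds must be non-negative")
        if self.lo_closed and self.lo == 0:
            raise ValueError("a closed lower bound must be non-zero")
        if self.hi is None:
            if self.hi_closed:
                raise ValueError("an infinite upper bound cannot be closed")
            return
        is_point = self.lo_closed and self.hi_closed
        if self.lo > self.hi or (self.lo == self.hi and not is_point):
            raise ValueError("empty interval")

    def __contains__(self, t):
        above = t > self.lo or (self.lo_closed and t == self.lo)
        below = (self.hi is None or t < self.hi
                 or (self.hi_closed and t == self.hi))
        return above and below


at_least_five = Interval(Fraction(5), None, True, False)  # the interval [5, oo)
```

With this encoding, `Fraction(5) in at_least_five` holds and `Fraction(9, 2) in at_least_five` does not; constructing `[0, 1]` raises `ValueError` because a closed lower bound of 0 is excluded.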

An event recording automaton over an alphabet $\Sigma$ is a timed automaton over 
$\Sigma$ which has a clock $x_a$ for each action $a$ in $\Sigma$. With each 
transition, labelled by say $a$, the set of clocks to be reset is fixed: it is the 
singleton $\{x_a\}$. Thus if the transitions of a timed automaton in general are of 
the form $q \xrightarrow{a,\,g,\,X} q'$ where $q$ and $q'$ are states of the 
automaton, $a$ is an action from $\Sigma$, $g$ is a guard (a boolean combination of 
atomic guards of the form $(x \in I)$), and $X$ is a set of clocks to be reset, then 
the transitions of an event recording automaton are of the form 
$q \xrightarrow{a,\,g} q'$, since the set of clocks to be reset is understood to be 
$\{x_a\}$. 

It will be convenient for us to define event recording automata in a slightly 
different manner, as it will help to make our arguments more transparent. The 
definition which follows can be seen to be equivalent to the one above. We first 
note that the guards can be canonicalised in the form 
$\bigwedge_{a \in \Sigma} (x_a \in I_a)$. This does not involve any loss of 
generality since a transition labelled by an arbitrary guard can be replaced by a 
collection of transitions, each of which is labelled by a guard of the above form. 
We can make use of the guards $x_a \in (0, \infty)$ to model the fact that $x_a$ 
does not play a role in the guard. With this in mind we introduce the notion of an 
interval alphabet. 

An (event recording) interval alphabet based on $\Sigma$ is a finite non-empty 
subset of $\Sigma \times (\mathcal{I}\mathbb{R})^{\Sigma}$. Thus, elements of an 
interval alphabet are of the form $(a, J)$ with $a \in \Sigma$ and 
$J : \Sigma \to \mathcal{I}\mathbb{R}$. 

Let $\sigma \in T\Sigma^\omega$ with $\sigma(i) = (a_i, t_i)$ for each 
$i \in \mathbb{N}$. We will use $time^a_i(\sigma)$ to denote the time of occurrence 
of the last $a$ action w.r.t. the position $i$ in $\sigma$. We will use the position 
$-1$ to denote the point at which we begin to count time. We define inductively 

- $time^a_{-1}(\sigma) = 0$ for all $a \in \Sigma$ 

- $time^a_i(\sigma) = \begin{cases} t_i & \text{if } a_i = a, \\ time^a_{i-1}(\sigma) & \text{otherwise.} \end{cases}$ 
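The inductive definition of the last-occurrence times can be sketched in Python on a finite prefix of a timed word. The function name `last_occurrence_times` and the list-of-dictionaries representation are assumptions made for illustration; index `i + 1` of the result stores the values for position `i`, so index 0 plays the role of position $-1$.

```python
from fractions import Fraction


def last_occurrence_times(prefix, alphabet):
    """prefix: list of (action, time) pairs; alphabet: iterable of actions.

    Returns a list `times` such that times[i + 1][a] is the time of the last
    a-action w.r.t. position i, and times[0][a] == 0 stands for position -1.
    """
    current = {a: Fraction(0) for a in alphabet}
    times = [dict(current)]
    for action, t in prefix:
        current[action] = Fraction(t)  # only the clock of the action just seen moves
        times.append(dict(current))
    return times


prefix = [("r", 2), ("a", 3), ("r", 8)]
times = last_occurrence_times(prefix, {"a", "r"})
```

Here `times[2]["r"]` is 2 (the last request before or at position 1 occurred at time 2), and `times[3]["r"]` is 8.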

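Let $\Gamma$ be an interval alphabet based on $\Sigma$. Let 
$\hat\sigma \in \Gamma^\omega$ with $\hat\sigma(i) = (a_i, J_i)$ for each 
$i \in \mathbb{N}$. Then $\hat\sigma$ induces in a natural way a set of timed words, 
denoted $tw(\hat\sigma)$, as follows. Let $\sigma \in T\Sigma^\omega$ with 
$\sigma(i) = (b_i, t_i)$ for each $i \in \mathbb{N}$. Then 
$\sigma \in tw(\hat\sigma)$ iff for each $i \in \mathbb{N}$: $b_i = a_i$ and for each 
$a \in \Sigma$, $(t_i - time^a_{i-1}(\sigma)) \in J_i(a)$. 

We extend the map $tw$ to work on subsets of $\Gamma^\omega$ in the natural way. 
Thus, for $\hat L \subseteq \Gamma^\omega$ we define 
$tw(\hat L) = \bigcup_{\hat\sigma \in \hat L} tw(\hat\sigma)$. 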

We are now in a position to define an event recording automaton and the 
timed language accepted by it. An event recording automaton over an alphabet 
$\Sigma$ is simply a Büchi automaton over an interval alphabet based on $\Sigma$. 
Viewed as a Büchi automaton over an interval alphabet $\Gamma$, an event recording 
automaton $\mathcal{A}$ accepts the language 
$L_{sym}(\mathcal{A}) \subseteq \Gamma^\omega$ which we will call the “symbolic” 
language accepted by $\mathcal{A}$. However, we will be more interested in the timed 
language accepted by $\mathcal{A}$; this is denoted $L(\mathcal{A})$ and is defined 
to be $tw(L_{sym}(\mathcal{A}))$. We say 







that $L \subseteq T\Sigma^\omega$ is an $\omega$-regular event recording language 
over $\Sigma$ if $L = L(\mathcal{A})$ for some event recording automaton 
$\mathcal{A}$ over $\Sigma$. Thus, event recording languages over $\Sigma$ are 
precisely those of the form $tw(\hat L)$, where $\hat L$ is an $\omega$-regular 
language over an interval alphabet based on $\Sigma$. 

3 Logical Characterisation 

The aim of this section is to characterise the class of $\omega$-regular event 
recording languages via a monadic second order logic interpreted over timed words. 
We call the logic $\mathrm{MSO}_{er}(\Sigma)$ and it is parameterised by the 
alphabet $\Sigma$. 

Here and in the logics to follow, we assume a supply of individual variables 
$x, y, \ldots$, and set variables $X, Y, \ldots$. These variables will range over 
positions (respectively sets of positions) of a given timed word. We will make use 
of the predicates $Q_a(x)$ (one for each $a \in \Sigma$) and $\Delta_a(x) \in I$, 
where $x$ is an individual variable, $a \in \Sigma$ and $I$ is an element of 
$\mathcal{I}\mathbb{R}$. The syntax of $\mathrm{MSO}_{er}(\Sigma)$ is given by: 

$$\varphi ::= (x \in X) \mid (x < y) \mid Q_a(x) \mid (\Delta_a(x) \in I) \mid \neg\varphi \mid (\varphi \vee \varphi) \mid \exists x \varphi \mid \exists X \varphi.$$ 

A structure for a formula of the logic will be a pair $(\sigma, \mathbb{I})$ where 
$\sigma \in T\Sigma^\omega$ and $\mathbb{I}$ is an interpretation which assigns to 
each individual variable a position of $\sigma$ (i.e. an element of $\mathbb{N}$), 
and to each set variable a set of positions of $\sigma$. The predicate ‘<’ is 
interpreted as the usual ordering on $\mathbb{N}$. 

The satisfaction relation $\sigma \models_{\mathbb{I}} \varphi$ for atomic formulas 
$\varphi$ is given below. Let $\sigma = (\alpha, \eta)$. Then 

$\sigma \models_{\mathbb{I}} (x \in X)$ iff $\mathbb{I}(x) \in \mathbb{I}(X)$ 

$\sigma \models_{\mathbb{I}} (x < y)$ iff $\mathbb{I}(x) < \mathbb{I}(y)$ 

$\sigma \models_{\mathbb{I}} Q_a(x)$ iff $\alpha(\mathbb{I}(x)) = a$ 

$\sigma \models_{\mathbb{I}} (\Delta_a(x) \in I)$ iff $(\eta(\mathbb{I}(x)) - time^a_{\mathbb{I}(x)-1}(\sigma)) \in I$. 

The operator $\Delta_a(x)$ measures the time elapsed since the last $a$ action 
w.r.t. the position $x$, and the predicate $\Delta_a(x) \in I$ asserts that this 
value lies in the interval $I$. 

The operators $\neg$, $\vee$, and the existential quantifiers $\exists x$ and 
$\exists X$ are interpreted in the usual manner. In particular the quantifier 
$\exists X$ is interpreted as follows. Let $\mathbb{I}$ be an interpretation for 
variables with respect to $\sigma$. Let $i \in \mathbb{N}$. We will use the notation 
$\mathbb{I}[i/x]$ to denote the interpretation which maps $x$ to $i$ and agrees 
with $\mathbb{I}$ on all other individual and set variables. Similarly, for a subset 
$S$ of $\mathbb{N}$, the notation $\mathbb{I}[S/X]$ will denote the interpretation 
which sends $X$ to $S$, and agrees with $\mathbb{I}$ on all other variables. 

$\sigma \models_{\mathbb{I}} \exists X \varphi$ iff there exists 
$S \subseteq \mathbb{N}$ such that $\sigma \models_{\mathbb{I}[S/X]} \varphi$. 

Given a sentence $\varphi$ in $\mathrm{MSO}_{er}(\Sigma)$ we define 
$L(\varphi) = \{\sigma \in T\Sigma^\omega \mid \sigma \models \varphi\}$. 

As an example, let $\Sigma = \{a, r\}$. Then the following sentence $\phi$ in 
$\mathrm{MSO}_{er}(\Sigma)$ asserts that consecutive requests ($r$'s) are separated 
by at least 5 time units: 

$$\forall x \, (Q_r(x) \Rightarrow (\Delta_r(x) \in [5, \infty))).$$ 
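The example property can be checked directly on a finite prefix of a timed word. The following Python sketch is illustrative only: the function name `separated_requests` and its parameters are assumptions, and a finite prefix can only refute the property, never establish it for the whole infinite word.

```python
def separated_requests(prefix, gap=5, request="r"):
    """Return False as soon as some request violates the separation property.

    prefix: list of (action, time) pairs with strictly increasing times.
    The clock for `request` starts at 0 (position -1), as in the semantics of
    the timed modality, so the first request is measured against time 0.
    """
    last = 0
    for action, t in prefix:
        if action == request:
            if t - last < gap:
                return False
            last = t
    return True
```

For example, the prefix `[("r", 5), ("a", 6), ("r", 10)]` passes, whereas `[("r", 2)]` fails: since the time elapsed is measured from 0 when no request has occurred yet, the sentence also forces the first request to occur no earlier than time 5.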







Theorem 1. Let $L \subseteq T\Sigma^\omega$. Then $L$ is an $\omega$-regular event 
recording language over $\Sigma$ iff $L = L(\varphi)$ for some sentence $\varphi$ 
in $\mathrm{MSO}_{er}(\Sigma)$. 

We will devote the rest of this section to the proof of this theorem. The 
proof will factor through the well-known logical characterisation of 
$\omega$-regular languages due to Büchi [4] (see also [12]). Recall that for an 
alphabet $A$, Büchi's monadic second order logic (denoted here by 
$\mathrm{MSO}(A)$) is given as follows: 

$$\varphi ::= (x \in X) \mid (x < y) \mid Q_a(x) \mid \neg\varphi \mid (\varphi \vee \varphi) \mid \exists x \varphi \mid \exists X \varphi.$$ 

A structure for this logic is a pair of the form $(\alpha, \mathbb{I})$ where 
$\alpha \in A^\omega$ and $\mathbb{I}$ assigns elements of $\mathbb{N}$ to 
individual variables, and subsets of $\mathbb{N}$ to set variables. The semantics of 
the logic is given in a similar manner to that of $\mathrm{MSO}_{er}$. In 
particular, the atomic formula $Q_a(x)$ (here $a$ is required to be in $A$) is 
interpreted as follows: $\alpha \models_{\mathbb{I}} Q_a(x)$ iff 
$\alpha(\mathbb{I}(x)) = a$. For a sentence $\varphi$ in $\mathrm{MSO}(A)$ we set 
$L(\varphi) = \{\alpha \in A^\omega \mid \alpha \models \varphi\}$. Büchi's result 
then states that a language $L \subseteq A^\omega$ is an $\omega$-regular language 
over $A$ iff $L = L(\varphi)$ for some sentence $\varphi$ in $\mathrm{MSO}(A)$. 

Next, we introduce the notion of a proper interval alphabet which will play 
an important role in this paper. We say a finite set of intervals 
$\mathcal{I} \subseteq \mathcal{I}\mathbb{R}$ is proper if it forms a finite 
partition of $\mathbb{R}^{>0}$. Thus, if $\mathcal{I}$ is a proper interval set, then 
for each $t \in \mathbb{R}^{>0}$ there exists an $I \in \mathcal{I}$ such that 
$t \in I$, and for each $I, I' \in \mathcal{I}$, $I \neq I'$ implies 
$I \cap I' = \emptyset$. An interval alphabet $\Gamma$ based on $\Sigma$ will be 
termed proper if for each $a \in \Sigma$ the set 
$\Gamma_a = \{I \mid \exists (b, J) \in \Gamma \text{ with } J(a) = I\}$ is a proper 
interval set. We say an interval set $\mathcal{I}$ covers an interval set 
$\mathcal{I}'$ if every interval in $\mathcal{I}'$ is the union of some collection of 
intervals in $\mathcal{I}$. Finally, an interval alphabet $\Gamma$ covers an interval 
alphabet $\Gamma'$ (both based on $\Sigma$) if $\Gamma_a$ covers $\Gamma'_a$ for 
each $a \in \Sigma$. 

Each interval alphabet $\Gamma$ induces in a canonical way a proper interval 
alphabet, denoted $prop(\Gamma)$, with the property that it covers $\Gamma$. It is 
given by 

$$prop(\Gamma) = \{(a, J) \mid a \in \Sigma \text{ and } \forall b \in \Sigma,\ J(b) \in prop(\Gamma_b)\}$$ 

where for each $b$, the set $prop(\Gamma_b)$ is obtained from $\Gamma_b$ by the 
procedure outlined below. 

Let $\mathcal{I}$ be a non-empty finite set of intervals (if it is empty, we simply 
set $prop(\mathcal{I}) = \{(0, \infty)\}$). Let 
$V = \{0, v_1, v_2, \ldots, v_n, \infty\}$ where for $1 \leq i \leq n$, $v_i \in V$ 
iff there exists $I \in \mathcal{I}$ with $v_i$ as the left or right end of $I$. 
Without loss of generality, we assume that $n \geq 1$ and 
$0 < v_1 < v_2 < \cdots < v_n < \infty$. Now define $prop(\mathcal{I})$ via: 

$$prop(\mathcal{I}) = \{(0, v_1)\} \cup \{[v_j, v_j],\ (v_j, v_{j+1}) \mid 1 \leq j \leq n\}$$ 

where we set $v_{n+1} = \infty$. It is easy to verify that $prop(\mathcal{I})$ is a 
proper interval set which covers $\mathcal{I}$. 
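The construction of a proper interval set from the end-points of a given interval set can be sketched as follows. This is an illustrative Python sketch under stated assumptions: intervals are encoded as tuples `(lo, hi, lo_closed, hi_closed)` with `None` standing for $\infty$, and the function name `prop` simply mirrors the notation of the text. When no finite non-zero end-point exists, the sketch returns the single interval $(0, \infty)$.

```python
from fractions import Fraction

INF = None  # stands for the upper bound "infinity" in this sketch


def prop(intervals):
    """Split (0, oo) at every finite non-zero end-point occurring in `intervals`.

    intervals: iterable of tuples (lo, hi, lo_closed, hi_closed).
    Returns the list (0, v1), [v1, v1], (v1, v2), ..., [vn, vn], (vn, oo).
    """
    ends = set()
    for lo, hi, _, _ in intervals:
        for v in (lo, hi):
            if v is not INF and v != 0:
                ends.add(Fraction(v))
    vs = sorted(ends)
    if not vs:
        return [(Fraction(0), INF, False, False)]
    result = [(Fraction(0), vs[0], False, False)]
    for j, v in enumerate(vs):
        nxt = vs[j + 1] if j + 1 < len(vs) else INF
        result.append((v, v, True, True))       # the point interval [v_j, v_j]
        result.append((v, nxt, False, False))   # the open interval (v_j, v_{j+1})
    return result
```

For the single interval $[5, \infty)$ this yields the three blocks $(0,5)$, $[5,5]$ and $(5,\infty)$, whose union $[5,5] \cup (5,\infty)$ recovers the original interval.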

The following is an important property of proper interval alphabets. 

Proposition 2 Let $\Gamma$ be a proper interval alphabet based on $\Sigma$. Then for 
each $\sigma \in T\Sigma^\omega$ there exists a unique word 
$\hat\sigma \in \Gamma^\omega$ such that $\sigma \in tw(\hat\sigma)$. 







Proof. The proposition is easy to verify once we note that for each 
$t \in \mathbb{R}^{>0}$ and $a \in \Sigma$, there exists a unique $I \in \Gamma_a$ 
such that $t \in I$. □ 
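The uniqueness argument suggests a direct way to compute the symbolic word for a finite prefix of a timed word. The Python sketch below is illustrative: it assumes that each action comes with a proper interval set encoded as tuples `(lo, hi, lo_closed, hi_closed)` (with `None` for $\infty$), and that every resulting pair $(a, J)$ belongs to the interval alphabet; the names `contains` and `symbolic_prefix` are hypothetical.

```python
def contains(interval, t):
    lo, hi, lo_closed, hi_closed = interval
    above = t > lo or (lo_closed and t == lo)
    below = hi is None or t < hi or (hi_closed and t == hi)
    return above and below


def symbolic_prefix(prefix, proper_sets):
    """prefix: list of (action, time); proper_sets: dict action -> proper interval set.

    Returns the list of symbolic letters (action, J), where J maps each action a
    to the unique interval containing the time elapsed since the last a.
    """
    last = {a: 0 for a in proper_sets}  # all clocks start at 0 (position -1)
    out = []
    for action, t in prefix:
        J = {}
        for a, intervals in proper_sets.items():
            matches = [iv for iv in intervals if contains(iv, t - last[a])]
            if len(matches) != 1:
                raise ValueError("interval set for %r is not proper" % (a,))
            J[a] = matches[0]
        out.append((action, J))
        last[action] = t  # reset the clock of the action just seen
    return out


proper = {
    "r": [(0, 5, False, False), (5, 5, True, True), (5, None, False, False)],
    "a": [(0, None, False, False)],
}
word = symbolic_prefix([("r", 5), ("a", 6), ("r", 7)], proper)
```

In this run the first request is mapped to the point interval $[5,5]$ for clock $r$, and the second request, arriving 2 time units later, is mapped to $(0,5)$.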

Now, given a formula $\varphi \in \mathrm{MSO}_{er}(\Sigma)$ we show how to 
translate it to a formula $t\text{-}s(\varphi) \in \mathrm{MSO}(\Gamma)$, for a 
suitably defined interval alphabet $\Gamma$. The translation will preserve, in a 
sense to be made precise, the timed models of $\varphi$. (The name t-s is a mnemonic 
for “timed-to-symbolic”.) Let $\Gamma$ be any proper interval alphabet over $\Sigma$ 
such that for each $a \in \Sigma$, $\Gamma_a$ covers 

$$voc_a(\varphi) = \{I \mid \varphi \text{ has a subformula of the form } (\Delta_a(x) \in I)\}.$$ 

Note that 
$\{(a, J) \mid a \in \Sigma, \text{ and } \forall b \in \Sigma,\ J(b) \in prop(voc_b(\varphi))\}$ 
is at least one such $\Gamma$. Then $t\text{-}s(\varphi)$ (w.r.t. $\Gamma$) is 
obtained from $\varphi$ by replacing sub-formulas of the form $Q_a(x)$ by the 
formula $\bigvee_{(b,J) \in \Gamma,\ b = a} Q_{(b,J)}(x)$ and sub-formulas of the 
form $\Delta_a(x) \in I$ by the formula 
$\bigvee_{(b,J) \in \Gamma,\ J(a) \subseteq I} Q_{(b,J)}(x)$. 
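The timed-to-symbolic translation is a purely syntactic rewriting, which the following Python sketch illustrates. The tuple-based formula encoding (tags such as `"Q"`, `"Delta"`, `"Qsym"`), the interval encoding `(lo, hi, lo_closed, hi_closed)` with `None` for $\infty$, and the helper names are all assumptions made for this example; the subset test presumes non-empty intervals.

```python
def subset(J, I):
    """Interval inclusion J <= I for non-empty intervals."""
    jlo, jhi, jlc, jhc = J
    ilo, ihi, ilc, ihc = I
    low_ok = jlo > ilo or (jlo == ilo and (ilc or not jlc))
    if ihi is None:
        high_ok = True
    elif jhi is None:
        high_ok = False
    else:
        high_ok = jhi < ihi or (jhi == ihi and (ihc or not jhc))
    return low_ok and high_ok


def t_s(phi, gamma):
    """Translate a timed formula into a formula over the interval alphabet gamma.

    gamma: list of letters (b, J) with J a dict from actions to intervals.
    """
    op = phi[0]
    if op == "Q":                       # Q_a(x)
        _, a, x = phi
        return ("or", [("Qsym", (b, J), x) for (b, J) in gamma if b == a])
    if op == "Delta":                   # Delta_a(x) in I
        _, a, x, I = phi
        return ("or", [("Qsym", (b, J), x) for (b, J) in gamma if subset(J[a], I)])
    if op in ("in", "lt"):              # (x in X) and (x < y) are unchanged
        return phi
    if op == "not":
        return ("not", t_s(phi[1], gamma))
    if op == "or":
        return ("or", [t_s(p, gamma) for p in phi[1]])
    if op in ("exists1", "exists2"):    # first- and second-order quantifiers
        return (op, phi[1], t_s(phi[2], gamma))
    raise ValueError("unknown operator: %r" % (op,))


blocks = [(0, 5, False, False), (5, 5, True, True), (5, None, False, False)]
gamma = [("r", {"r": iv}) for iv in blocks]
translated = t_s(("Delta", "r", "x", (5, None, True, False)), gamma)
```

For the one-letter alphabet above, the constraint that at least 5 time units have elapsed since the last $r$ becomes a disjunction over the two symbolic letters whose interval is $[5,5]$ or $(5,\infty)$.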

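Lemma 1. Let $\varphi \in \mathrm{MSO}_{er}(\Sigma)$ and let $\Gamma$ be a proper 
interval alphabet based on $\Sigma$ such that $\Gamma_a$ covers $voc_a(\varphi)$ for 
each $a \in \Sigma$. Let $\hat\sigma \in \Gamma^\omega$ and 
$\sigma \in T\Sigma^\omega$ be such that $\sigma \in tw(\hat\sigma)$. Suppose further 
that $\mathbb{I}$ is an interpretation for variables. Then 

1. $\sigma \models_{\mathbb{I}} \varphi$ iff $\hat\sigma \models_{\mathbb{I}} t\text{-}s(\varphi)$. 

2. If $\varphi$ is a sentence, then $L(\varphi) = tw(L(t\text{-}s(\varphi)))$. 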

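Proof. (1) We prove the statement by induction on the structure of $\varphi$. The 
interesting cases are $\varphi = Q_a(x)$ and $\varphi = (\Delta_a(x) \in I)$. Let 
$\sigma = (\alpha, \eta)$, and $\hat\sigma = (a_0, J_0)(a_1, J_1) \cdots$. 

Case $\varphi = Q_a(x)$: We know $\sigma \models_{\mathbb{I}} Q_a(x)$ iff 
$\alpha(\mathbb{I}(x)) = a$. But since $\sigma \in tw(\hat\sigma)$, we know that this 
holds iff $\hat\sigma(\mathbb{I}(x)) = (a, J)$ for some $J$ such that 
$(a, J) \in \Gamma$. This in turn holds iff 
$\hat\sigma \models_{\mathbb{I}} \bigvee_{(b,J') \in \Gamma,\ b = a} Q_{(b,J')}(x)$. 
Thus, $\sigma \models_{\mathbb{I}} \varphi$ iff 
$\hat\sigma \models_{\mathbb{I}} t\text{-}s(\varphi)$. 

Case $\varphi = (\Delta_a(x) \in I)$: Let 
$\sigma \models_{\mathbb{I}} (\Delta_a(x) \in I)$. Then we know that 
$(\eta(\mathbb{I}(x)) - time^a_{\mathbb{I}(x)-1}(\sigma)) \in I$. Further, since 
$\sigma \in tw(\hat\sigma)$, we know that 

$$(\eta(\mathbb{I}(x)) - time^a_{\mathbb{I}(x)-1}(\sigma)) \in J_{\mathbb{I}(x)}(a).$$ 

Using the fact that $\Gamma$ is proper and covers $voc_a(\varphi)$, it must be the 
case that $J_{\mathbb{I}(x)}(a) \subseteq I$. Hence 
$\hat\sigma \models_{\mathbb{I}} \bigvee_{(b,J) \in \Gamma,\ J(a) \subseteq I} Q_{(b,J)}(x)$. 

Conversely, let 
$\hat\sigma \models_{\mathbb{I}} \bigvee_{(b,J) \in \Gamma,\ J(a) \subseteq I} Q_{(b,J)}(x)$. 
Then $\hat\sigma(\mathbb{I}(x)) = (b, J)$ for some $(b, J) \in \Gamma$ such that 
$J(a) \subseteq I$. Since $\sigma \in tw(\hat\sigma)$ we have 
$(\eta(\mathbb{I}(x)) - time^a_{\mathbb{I}(x)-1}(\sigma)) \in J(a)$. Since 
$J(a) \subseteq I$, we have 
$(\eta(\mathbb{I}(x)) - time^a_{\mathbb{I}(x)-1}(\sigma)) \in I$, and hence 
$\sigma \models_{\mathbb{I}} (\Delta_a(x) \in I)$. 

(2) This is easy to see once we have (1) above. Let $\sigma \models \varphi$. Then, 
again using properties of proper alphabets (Proposition 2), there exists a 
$\hat\sigma \in \Gamma^\omega$ such that $\sigma \in tw(\hat\sigma)$. Using (1) 
above, we have $\hat\sigma \in L(t\text{-}s(\varphi))$ and hence 
$\sigma \in tw(L(t\text{-}s(\varphi)))$. Conversely, if $\sigma \in tw(\hat\sigma)$ 
and $\hat\sigma \models t\text{-}s(\varphi)$, then by (1) again we have that 
$\sigma \models \varphi$, and hence $\sigma \in L(\varphi)$. □ 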







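Let $\Gamma$ be an interval alphabet based on $\Sigma$. We now show how we can 
associate a formula $s\text{-}t(\hat\varphi) \in \mathrm{MSO}_{er}(\Sigma)$ with a 
formula $\hat\varphi \in \mathrm{MSO}(\Gamma)$, such that, once again, the 
translated formula preserves timed models. The formula $s\text{-}t(\hat\varphi)$ is 
obtained by replacing atomic sub-formulas in $\hat\varphi$ of the form 
$Q_{(a,J)}(x)$ by the formula 
$Q_a(x) \wedge \bigwedge_{b \in \Sigma} (\Delta_b(x) \in J(b))$. 

The following lemma is easy to show along the lines of Lemma 1: 

Lemma 2. Let $\Gamma$ be a proper interval alphabet based on $\Sigma$ and let 
$\hat\varphi \in \mathrm{MSO}(\Gamma)$. Let $\hat\sigma \in \Gamma^\omega$ and 
$\sigma \in T\Sigma^\omega$ such that $\sigma \in tw(\hat\sigma)$. Suppose further 
that $\mathbb{I}$ is an interpretation for variables. Then 

1. $\sigma \models_{\mathbb{I}} s\text{-}t(\hat\varphi)$ iff $\hat\sigma \models_{\mathbb{I}} \hat\varphi$. 

2. If $\hat\varphi$ is a sentence, then we have $L(s\text{-}t(\hat\varphi)) = tw(L(\hat\varphi))$. □ 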

We are now in a position to provide a proof of Theorem 1. Let $L$ be an 
$\omega$-regular event recording language over $\Sigma$. It is not difficult to see 
that there must be a proper interval alphabet $\Gamma$ based on $\Sigma$, and an 
$\omega$-regular subset $\hat L$ of $\Gamma^\omega$ such that $L = tw(\hat L)$. 
Büchi's theorem tells us that there exists an $\mathrm{MSO}(\Gamma)$-sentence 
$\hat\varphi$ such that $L(\hat\varphi) = \hat L$. Hence $L = tw(L(\hat\varphi))$. By 
Lemma 2, we have an $\mathrm{MSO}_{er}(\Sigma)$-sentence, namely 
$\varphi = s\text{-}t(\hat\varphi)$, such that $L = L(\varphi)$. 

Conversely, let $\varphi$ be an $\mathrm{MSO}_{er}(\Sigma)$-sentence. Let $\Gamma$ be 
a proper interval alphabet based on $\Sigma$ such that $\Gamma_a$ covers 
$voc_a(\varphi)$ for each $a \in \Sigma$. Then, by Lemma 1, we know that there exists 
a formula $\hat\varphi = t\text{-}s(\varphi)$ in $\mathrm{MSO}(\Gamma)$, such that 
$L(\varphi) = tw(L(\hat\varphi))$. Using Büchi's theorem once more, we know that 
$L(\hat\varphi)$ is an $\omega$-regular language over $\Gamma$. Thus $L(\varphi)$ is 
an $\omega$-regular event recording language over $\Sigma$. This completes the proof 
of Theorem 1. □ 

To conclude this section we point out the nature of the projection (or renaming) 
operation associated with existential quantification in our logic. In classical 
monadic logics (as in S1S) it is usually the case that both open formulas 
and sentences correspond to languages in the same class under consideration. 
In the case of $\mathrm{MSO}_{er}$ however, while sentences correspond to event 
recording languages, the open formulas correspond more naturally to a class of 
languages we call quasi event recording languages. The associated closure under 
projection is thus with respect to these languages, and not event recording 
languages, which are not closed under projection. We will formalise these ideas 
below. 

Let $U$ be a (possibly infinite) universe of letters (from which our alphabets 
will be drawn). Consider a finite partition of this universe, given by a function 
$f$ from the universe $U$ to a finite indexing set $X$. 

Let us fix such a triple $(U, f, X)$. Let $A \subseteq U$ be an alphabet of actions. 
Then a quasi event recording automaton (qERA) over $A$ (w.r.t. the triple 
$(U, f, X)$) is a timed automaton over $A$ with a set of clocks $X$ and the 
restriction that for every action $a \in A$, the set of clocks reset along a 
transition labelled $a$ is exactly $\{f(a)\}$. A qERA is thus a weaker form of event 
recording automata (it is not difficult to see that an event recording automaton 
over $A$ can simulate a qERA over $A$). We will say $L \subseteq TU^\omega$ is an 
($\omega$-regular) quasi event recording language (qERL) w.r.t. $(U, f, X)$ if 
$L = L(\mathcal{A})$ for some alphabet $A \subseteq U$, and qERA $\mathcal{A}$ over 
$A$. 







Let $A, A'$ be alphabets, with $A, A' \subseteq U$. A renaming from $A$ to $A'$ is a 
map from $A$ to $A'$. We say $\varsigma : A \to A'$ is a valid renaming w.r.t. 
$(U, f, X)$ iff for each $a \in A$: $f(a) = f(\varsigma(a))$ (i.e. both 
$\varsigma(a)$ and $a$ belong to the same block of the partition). For a timed word 
$\sigma \in TA^\omega$ the timed word $\varsigma(\sigma) \in TA'^\omega$ is defined 
in the expected manner. The following proposition is then easily verified: 

Proposition 3 The class of quasi event recording languages over $(U, f, X)$ is 
closed under renaming operations which are valid w.r.t. $(U, f, X)$. □ 
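The validity condition on renamings is easy to state operationally. The sketch below is an illustration in Python; the function names and the toy universe, in which the block of a letter is its first character, are hypothetical choices for the example.

```python
def is_valid_renaming(renaming, f):
    """renaming: dict from letters of A to letters of A'; f: function U -> X.

    A renaming is valid iff every letter stays within its block of the partition,
    so the clock reset by a letter is the same before and after renaming.
    """
    return all(f(a) == f(b) for a, b in renaming.items())


def rename_word(prefix, renaming):
    """Apply the renaming letter by letter; time stamps are left unchanged."""
    return [(renaming[a], t) for a, t in prefix]


def block(letter):
    # Toy partition: letters such as "r1", "r2" share the block "r".
    return letter[0]
```

With this partition, `{"r1": "r2", "a1": "a1"}` is a valid renaming, while `{"r1": "a1"}` is not, since it would move a letter to a block with a different clock.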

If we now consider a direct proof of Theorem 1 along the lines of the one 
for MSO (see [12]), it will be clear that existential quantification in 
$\mathrm{MSO}_{er}$ corresponds to the class qERL being closed under the restricted 
renaming defined above. We will concentrate on one direction of the proof where we 
show that every $\mathrm{MSO}_{er}(\Sigma)$ sentence can be captured by an event 
recording automaton over $\Sigma$. This is done by associating, inductively, a qERA 
with each formula in $\mathrm{MSO}_{er}(\Sigma)$. Let $\varphi(m, n)$ denote a 
formula whose free variables are among $\{x_1, \ldots, x_m, X_1, \ldots, X_n\}$. A 
structure $(\sigma, \mathbb{I})$ for a formula $\varphi(m, n)$ in 
$\mathrm{MSO}_{er}(\Sigma)$ is encoded as a timed word over $\Sigma_{m+n}$ where for 
$i \geq 0$, $\Sigma_i$ is defined to be 

$$\Sigma \times \underbrace{\{0,1\} \times \cdots \times \{0,1\}}_{i \text{ times}}.$$ 

Let $\sigma = (\alpha, \eta)$. Then $(\sigma, \mathbb{I})$ is represented as 
$\sigma' \in T(\Sigma_{m+n})^\omega$ with $\sigma' = (\alpha', \eta)$ where 
$\alpha'$ is given as follows: for each $i \in \mathbb{N}$, 
$\alpha'(i) = (a, b_1, \ldots, b_m, c_1, \ldots, c_n)$ where $a = \alpha(i)$, 
$b_j = 1$ iff $\mathbb{I}(x_j) = i$ and $c_j = 1$ iff $i \in \mathbb{I}(X_j)$. The 
satisfaction relation $\sigma' \models \varphi(m, n)$ is defined in the expected 
manner based on the semantics given earlier. Each formula $\varphi(m, n)$ in 
$\mathrm{MSO}_{er}(\Sigma)$ thus defines a subset of $T(\Sigma_{m+n})^\omega$, 
denoted $L(\varphi(m, n))$, which is the set of models of $\varphi$. 

In the induction step for the “$\exists X$” case, we assume that the formula 
$\varphi(m, n+1)$ is such that 
$L(\varphi(m, n+1)) \subseteq T(\Sigma_{m+n+1})^\omega$ is accepted by a qERA over 
$\Sigma_{m+n+1}$, with respect to the triple $(U, g, \Sigma)$ where $g$ (restricted 
to $\bigcup_{i \geq 0} \Sigma_i$) is given by $g((a, d_1, \ldots, d_i)) = a$. Then 
the set of models of the formula $(\exists X_{n+1} \varphi)(m, n)$ is simply obtained 
from $L(\varphi(m, n+1))$ by projecting each letter of $\Sigma_{m+n+1}$ to the 
dimensions 1 to $m+n$. This projection is clearly a valid renaming w.r.t. 
$(U, g, \Sigma)$ and by Proposition 3 the language 
$L((\exists X_{n+1} \varphi)(m, n))$ is also a qERL over $\Sigma_{m+n}$. Note that 
in this way, for a sentence $\varphi$, the language $L(\varphi)$ is accepted by a 
qERA over $\Sigma$ (w.r.t. $(U, g, \Sigma)$), which is nothing but an event 
recording automaton over $\Sigma$. 
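The encoding of a structure as a word over the extended alphabet, and the projection used for the existential set quantifier, can be sketched in Python on finite prefixes. The function names `encode`, `project_last` and `g` are illustrative assumptions; letters are represented as tuples whose first component is the action.

```python
def encode(prefix, individual, sets):
    """Encode a structure as a timed word over the extended alphabet.

    prefix: list of (action, time); individual: positions assigned to x_1..x_m;
    sets: sets of positions assigned to X_1..X_n.
    """
    out = []
    for i, (a, t) in enumerate(prefix):
        bits = (tuple(int(p == i) for p in individual)
                + tuple(int(i in S) for S in sets))
        out.append(((a,) + bits, t))
    return out


def project_last(encoded):
    """Drop the last bit of every letter (the track of the quantified set variable)."""
    return [(letter[:-1], t) for letter, t in encoded]


def g(letter):
    """Block of a letter: its underlying action, which determines the clock reset."""
    return letter[0]


encoded = encode([("a", 1), ("r", 2)], individual=[1], sets=[{0, 1}])
projected = project_last(encoded)
```

Because `project_last` never changes the first component of a letter, `g` takes the same value before and after projection, which is exactly the validity condition on renamings.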



4 Expressive Completeness of LTLer 

In this section we formulate a version of the timed temporal logic nrEventClockTL 
introduced in [10] and called here $\mathrm{LTL}_{er}$. The satisfiability and 
model-checking problems for $\mathrm{LTL}_{er}$ are solved in [10] and shown to be 
PSPACE-complete. Here we concentrate on proving the expressive completeness of 
$\mathrm{LTL}_{er}$. 







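Let $\Sigma$ be an alphabet of actions. Then the formulas of 
$\mathrm{LTL}_{er}(\Sigma)$ (parameterised by the alphabet $\Sigma$) are given by: 

$$\varphi ::= \top \mid \lhd_a \in I \mid \langle a \rangle \varphi \mid O\varphi \mid \neg\varphi \mid (\varphi \vee \varphi) \mid (\varphi \, U \varphi).$$ 

Here we require $a \in \Sigma$. 

The models for $\mathrm{LTL}_{er}(\Sigma)$ formulas are timed words over $\Sigma$. 
Let $\sigma \in T\Sigma^\omega$, with $\sigma = (\alpha, \eta)$, and let 
$i \in \{-1\} \cup \mathbb{N}$. Then the satisfaction relation 
$\sigma, i \models \varphi$ is given by 

$\sigma, i \models \top$ 

$\sigma, i \models \lhd_a \in I$ iff $i \geq 0$ and $(\eta(i) - time^a_{i-1}(\sigma)) \in I$ 

$\sigma, i \models \langle a \rangle \varphi$ iff $\alpha(i+1) = a$ and $\sigma, i+1 \models \varphi$ 

$\sigma, i \models O\varphi$ iff $\sigma, i+1 \models \varphi$ 

$\sigma, i \models \neg\varphi$ iff $\sigma, i \not\models \varphi$ 

$\sigma, i \models \varphi \vee \varphi'$ iff $\sigma, i \models \varphi$ or $\sigma, i \models \varphi'$ 

$\sigma, i \models \varphi \, U \varphi'$ iff $\exists k \geq i : \sigma, k \models \varphi'$ and $\forall j : i \leq j < k,\ \sigma, j \models \varphi$. 

We say $\sigma \models \varphi$ iff $\sigma, -1 \models \varphi$. Define 
$L(\varphi) = \{\sigma \in T\Sigma^\omega \mid \sigma \models \varphi\}$. 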

As an example, the $\mathrm{LTL}_{er}(\Sigma)$ formula 

$$\Box(\langle r \rangle \top \Rightarrow O(\lhd_r \in [5, \infty))),$$ 

where $\Box\alpha$ is an abbreviation for $\neg(\top \, U \neg\alpha)$, rephrases the 
property $\phi$ of Section 3. 
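As a worked check of this abbreviation, and assuming the until operator is read with non-strict bounds ($k \geq i$ and $i \leq j < k$), the derived semantics of the "always" operator unfolds as follows. The first conjunct of the until clause is trivially satisfied because every position satisfies $\top$.

```latex
\begin{align*}
\sigma, i \models \Box\varphi
  &\iff \sigma, i \not\models \top \, U \, \neg\varphi \\
  &\iff \neg\, \exists k \geq i :\ \sigma, k \models \neg\varphi \\
  &\iff \forall k \geq i :\ \sigma, k \models \varphi .
\end{align*}
```

Evaluated at the initial position $-1$, the example formula therefore requires, for every $k \geq -1$ with $\alpha(k+1) = r$, that $\eta(k+1) - time^r_k(\sigma) \geq 5$, which is the separation requirement on requests stated in Section 3.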

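It will be convenient for us to combine the two modalities 
$\langle a \rangle \varphi$ and $\lhd_a \in I$ into a single modality of the form 
$\langle (a, J) \rangle \varphi$ where $a \in \Sigma$ and 
$J \in (\mathcal{I}\mathbb{R})^{\Sigma}$. The semantics of the new modality is given 
by 

$\sigma, i \models \langle (a, J) \rangle \varphi$ iff $\alpha(i+1) = a$, 
$\forall b \in \Sigma$ $(\eta(i+1) - time^b_i(\sigma)) \in J(b)$, and 
$\sigma, i+1 \models \varphi$. 

The expressiveness of the logic can be seen to remain the same despite this 
change. 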

Let $\mathrm{FO}_{er}(\Sigma)$ denote the first-order fragment of the logic 
$\mathrm{MSO}_{er}(\Sigma)$. Thus $\mathrm{FO}_{er}(\Sigma)$ is obtained from 
$\mathrm{MSO}_{er}(\Sigma)$ by disallowing the use of quantification over set 
variables. The aim of the rest of this section is to prove the following result. 

Theorem 4. $\mathrm{LTL}_{er}(\Sigma)$ is expressively equivalent to 
$\mathrm{FO}_{er}(\Sigma)$. 

The method of proof will be to translate $\mathrm{LTL}_{er}$ formulas into classical 
LTL over an appropriate interval alphabet. The method is similar to the proof of 
Theorem 1 and we will also make use of the translation used there. 

It will be useful to first recall the definition of LTL and the result concerning 
its expressive completeness. Let $A$ be an alphabet of actions. Then the formulas 
of the so-called “action-based” LTL, denoted $\mathrm{LTL}(A)$, are given by the 
syntax: 

$$\varphi ::= \top \mid \langle a \rangle \varphi \mid O\varphi \mid \neg\varphi \mid (\varphi \vee \varphi) \mid (\varphi \, U \varphi).$$ 







In the formula $\langle a \rangle \varphi$ we require $a \in A$. The semantics of 
$\mathrm{LTL}(A)$ is given similarly to $\mathrm{LTL}_{er}(\Sigma)$ above, with 
models being infinite words over $A$. In particular, for a word 
$\alpha \in A^\omega$ we have 

$\alpha, i \models \langle a \rangle \varphi$ iff $\alpha(i+1) = a$ and $\alpha, i+1 \models \varphi$. 

We will say that $\alpha \models \varphi$ iff $\alpha, -1 \models \varphi$. We set 
$L_{sym}(\varphi) = \{\alpha \in A^\omega \mid \alpha \models \varphi\}$. 

Let $\mathrm{FO}(A)$ denote the first-order fragment of the logic 
$\mathrm{MSO}(A)$. As before, $\mathrm{FO}(A)$ is obtained from the logic 
$\mathrm{MSO}(A)$ defined in Section 3, by disallowing the use of set variables. 
Then a well known result due to the work of Kamp, and Gabbay et al., is: 

Theorem 5 ([7,8]). $\mathrm{LTL}(A)$ is expressively equivalent to $\mathrm{FO}(A)$. □ 

Looking back at the syntax of $\mathrm{LTL}_{er}(\Sigma)$ formulas, we see that they 
are simply $\mathrm{LTL}(\Gamma)$ formulas for some interval alphabet $\Gamma$ based 
on $\Sigma$. Of course, we must bear in mind that $\mathrm{LTL}_{er}(\Sigma)$ 
formulas are interpreted over timed words over $\Sigma$. Thus, a formula 
$\varphi \in \mathrm{LTL}(\Gamma)$ defines a language 
$L_{sym}(\varphi) \subseteq \Gamma^\omega$ when interpreted as an 
$\mathrm{LTL}(\Gamma)$ formula, and it defines a timed language 
$L(\varphi) \subseteq T\Sigma^\omega$ when interpreted as an 
$\mathrm{LTL}_{er}(\Sigma)$ formula. 

The following lemma describes the relationship between these two languages: 

Lemma 3. Let $\Gamma$ be a proper interval alphabet based on $\Sigma$. Let $\varphi$ 
be a formula in $\mathrm{LTL}(\Gamma)$. Then $L(\varphi) = tw(L_{sym}(\varphi))$. 

Proof. The proof of this is very similar to our earlier arguments which make use 
of the properties of proper interval sets. □ 

Returning now to the proof of Theorem 4, let 
$\varphi_0 \in \mathrm{LTL}_{er}(\Sigma)$. Then it is not difficult to see that we 
can construct a proper interval alphabet $\Gamma$ based on $\Sigma$ such that 
$\Gamma_a$ covers $voc_a(\varphi_0)$ for each $a \in \Sigma$, and a formula 
$\varphi_1 \in \mathrm{LTL}(\Gamma)$ such that $L(\varphi_0) = L(\varphi_1)$. From 
Lemma 3, we know that $L(\varphi_1) = tw(L_{sym}(\varphi_1))$. Now, by Theorem 5, we 
know that there exists a sentence $\varphi_2$ in $\mathrm{FO}(\Gamma)$ such that 
$L(\varphi_2) = L_{sym}(\varphi_1)$. Now consider the sentence 
$\varphi_3 = s\text{-}t(\varphi_2)$ w.r.t. the proper interval alphabet $\Gamma$ 
(cf. Section 3). The translations s-t and t-s are such that if the given formula is 
first-order, then so is the translated formula. Thus $\varphi_3$ is a 
$\mathrm{FO}_{er}(\Sigma)$ sentence. Further, since $\Gamma$ is proper, by Lemma 2 
we know that $L(\varphi_3) = tw(L(\varphi_2))$. Thus $\varphi_3$ is the required 
$\mathrm{FO}_{er}(\Sigma)$ sentence with $L(\varphi_0) = L(\varphi_3)$. 

Conversely, let $\varphi_0$ be a sentence in $\mathrm{FO}_{er}(\Sigma)$. Then, once 
again, there exists a proper interval alphabet $\Gamma$ based on $\Sigma$ such that 
$\Gamma_a$ covers $voc_a(\varphi_0)$ for each $a \in \Sigma$. Consider the 
$\mathrm{MSO}(\Gamma)$ sentence $\varphi_1 = t\text{-}s(\varphi_0)$ with respect to 
the interval alphabet $\Gamma$ (cf. Section 3). By Lemma 1, 
$L(\varphi_0) = tw(L(\varphi_1))$. Further, $\varphi_1$ is a sentence in 
$\mathrm{FO}(\Gamma)$. Now, again appealing to Theorem 5, we know that there exists 
an $\mathrm{LTL}(\Gamma)$ formula $\varphi_2$ such that 
$L_{sym}(\varphi_2) = L(\varphi_1)$. By Lemma 3, we know that 
$L(\varphi_2) = tw(L_{sym}(\varphi_2))$. Thus $\varphi_2$ is the required formula in 
$\mathrm{LTL}_{er}(\Sigma)$ such that $L(\varphi_2) = L(\varphi_0)$. □ 







In conclusion we would like to point out that the results here (including 
Theorems 1 and 4) can be readily extended to the class of event clock automata. 
For this we will need to extend $MSO_{er}$ with a “predicting” modality
which measures the time to the next occurrence of action a with respect to the
position x. Correspondingly the logic $LTL_{er}$ can be extended with the operator
$\triangleright_a$ as originally formulated in [10].

Acknowledgments: I am grateful to P. S. Thiagarajan for several useful 
inputs, and to Ramesh Vishwanathan for his comments on a draft of this paper. 



References 

1. R. Alur, D. L. Dill: A theory of timed automata, Theoretical Computer Science 126:
183-235 (1994).

2. R. Alur, L. Fix, T. A. Henzinger: Event-clock automata: a determinizable class of
timed automata, Proc. 6th International Conference on Computer-aided Verification,
LNCS 818, 1-13, Springer-Verlag (1994).

3. R. Alur, T. A. Henzinger: Real-time logics: complexity and expressiveness,
Information and Computation 104, 35-77 (1993).

4. J. R. Büchi: Weak second-order arithmetic and finite automata, Zeitschrift für Math.
Logik und Grundlagen der Mathematik, 6, 66-92 (1960).

5. D. D’Souza, T. Hune: An on-the-fly Construction for an Event Clock Logic,
Manuscript (1999).

6. D. D’Souza, P. S. Thiagarajan: Product Interval Automata: A Subclass of Timed
Automata, Proc. 19th Foundations of Software Technology and Theoretical Computer
Science (FSTTCS), LNCS 1732 (1999).

7. A. W. H. Kamp: Tense Logic and the Theory of Linear Order, PhD Thesis,
University of California (1968).

8. D. Gabbay, A. Pnueli, S. Shelah, J. Stavi: The Temporal Analysis of Fairness,
Seventh ACM Symposium on Principles of Programming Languages, 163-173, (1980).

9. T. A. Henzinger, J.-F. Raskin, and P.-Y. Schobbens: The regular real-time languages,
Proc. 25th International Colloquium on Automata, Languages, and Programming
1998, LNCS 1443, 580-591 (1998).

10. J.-F. Raskin: Logics, Automata and Classical Theories for Deciding Real Time,
Ph.D Thesis, FUNDP, Belgium.

11. J.-F. Raskin, P.-Y. Schobbens: State-clock Logic: A Decidable Real-Time Logic,
Proc. HART ’97: Hybrid and Real-Time Systems, LNCS 1201, 33-47 (1997).

12. W. Thomas: Automata on Infinite Objects, in J. V. Leeuwen (Ed.), Handbook of
Theoretical Computer Science, Vol. B, 133-191, Elsevier Science Publ., Amsterdam
(1990).

13. Th. Wilke: Specifying Timed State Sequences in Powerful Decidable Logics and
Timed Automata, in Formal Techniques in Real-Time and Fault-Tolerant Systems,
LNCS 863, 694-715 (1994).




Using Cylindrical Algebraic Decomposition for
the Analysis of Slope Parametric Hybrid
Automata



Michael Adelaide* and Olivier Roux * 

Institut de Recherche en Communication et Cybernetique de Nantes 



Abstract. We address the ambitious problem of synthesizing the
parameters which stand for the execution speeds in time-constrained
executions of a real-time system. The core of the paper is a new method based
on Gröbner bases and the so-called Cylindrical Algebraic Decomposition,
used to design a simplification algorithm and an inclusion test
for sets of inequalities. The method is illustrated throughout the paper
with a small example.



1 Introduction 



Objective and Framework. The parametric analysis of a real-time system
consists in computing some parameters of the system to ensure its correct
behaviour. More precisely, the purpose of the work presented in this
paper is to achieve parametric analysis of evolution laws in a
hybrid automaton [ACD90,MMP92,AH94,AD94,AMP95,ACH+95], which
is intended to model the possible executions of a dynamical system. This
means that, given a hybrid automaton in which the evolution speeds of
one of its variables are unknown, we try to determine the set of speeds of
that variable which make it possible to meet some temporal
requirements. Unknowns are usually referred to as parameters in the
literature, hence the name parametric analysis. The evolution speeds
are the slopes of the variables; they are not constant but affine
expressions over a set of parameters.

* IRCCyN/CNRS UMR 6597 (1 rue de la Noe, BP 92101, 44321 Nantes cedex 03,
France) e-mail: {Michael.Adelaide | Olivier.Roux}@ircyn.ec-nantes.fr

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 252-263, 2000.

© Springer-Verlag Berlin Heidelberg 2000







Related works. This paper follows [BBRR97,BR97] and goes further
in the way we deal with parametric polyhedra, which are symbolic
representations of the set of states of the system. We elaborated a
simplification algorithm and an inclusion test for the systems of inequalities.
This algorithm and this test are based on Gröbner bases [BW93] and
the Cylindrical Algebraic Decomposition [Jir95].

Other works have already contributed to “parametric reasoning about
real-time” [AHV93,HH95,Wan96], but the difference with our work is that
they are mostly concerned with delays as parameters, while we study
speeds as parameters.

Outline of the Paper. The definitions are recalled and our method is
illustrated throughout the paper by means of a small example, the control
of a water-tank level. We first introduce, in section 2, the definitions of
Slope Parametric Linear Hybrid Automata. Then, section 3 gives a quick
overview of the key concepts of reachability analysis that will be used
in section 4. That section is the core of the paper: we give an algorithm
for simplifying systems of parametric inequations and an inclusion test.
Section 5 provides some results obtained on the example that
illustrates the method. Finally, in section 6, we conclude and give
some directions for future work.

2 Slope Parametric Linear Hybrid Automata 

2.1 Definitions 

The Slope Parametric Linear Hybrid Automaton (SPLHA) model is an
extension of the Linear Hybrid Automaton (LHA) [Hen96] model. In
SPLHA, a set of parameters $K = \{k_1, ..., k_q\}$ is added to formulate the
slopes of the variables. When the parameters are set to specific values, one
obtains an LHA again. The definition of the states of an SPLHA deserves
attention. For LHA, a state is given by a node and values of the
clocks in this node. For SPLHA, one must also consider the conditions C
on the parameters. In the general case, the values of the clocks (given by
the set $X = \{x_1, ..., x_n\}$) are set by parametric polyhedra. A parametric
polyhedron is a set $\Pi$ of inequalities $f_j(k, x) \geq 0$ which are linear over
the set X of clocks, but not linear when the parameters are considered as
well: the coefficients of the clocks in $f_j(k, x)$ depend on $k$.

2.2 An Example 

In order to have a better understanding of Parametric Slopes, let us take
an example. We want to model a water tank (see Figure 1), the behaviour
of which is a succession of filling and emptying operations. The water level
$x_2$ is controlled by the clock $x_1$. The clock rate is 1, and the parameter k
is used to define the filling speed of the tank.



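[Figure 1: hybrid automaton of the water tank. Initial values: x1 = 1, x2 = 1.
Node 'Filling': invariant x1 <= 3, dx1/dt = 1, dx2/dt = k.
Node 'Emptying': invariant x2 >= 0, dx1/dt = 1, dx2/dt = -2.
Edge from 'Filling' to 'Emptying': guard x1 = 3, reset x1 := 0.
Edge from 'Emptying' to 'Filling': guard on x2.]

Fig. 1. Example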



Throughout the paper, we are going to show how the conditions that
enable a correct behaviour appear, by studying the states of the automaton.
For instance, on entering the 'emptying' node, the states are given by the
condition $k \geq -\frac{1}{2}$ and the clocks by the polyhedron $x_1 = 0 \wedge x_2 = 1 + 2k$.

3 Outline of the Reachability Analysis 

We are interested in finding the conditions on the parameters that
ensure a correct behaviour of the automaton. Frequently, the problem is
turned into a reachability one. It consists in finding the conditions on the
parameters for which a state is reachable. This analysis was first
developed in [BBRR97,BR97,Bur98]. It is based upon both an analysis
of the evolution of the automaton and a fix-point computation according
to the following formulas:

$Reach(A) = LeastFixPoint(Crossing(Start))$ (1)

$Crossing(Q) = \{d \mid \exists s \in Q,\ s \rightarrow d\}$ (2)

All the operations are quantifier eliminations (QE) in real algebra. In
the next paragraphs, we are going to define the operations done in the
elementary step and the fix-point computation.

3.1 The Elementary Step : Crossing an Edge 

For a given set of states Q, the elementary step computes all states d
(destination) reachable from any state s (source) which belongs to Q after
one edge crossing. Showing the different steps of the analysis, we are going
to underline the intermediate conditions $C_i$ and polyhedra $\Pi_i$ found from
the start condition $C_0$ (true) and polyhedron $\Pi_0$ ($x_1 = x_2 = 1$).

The Extension Operation. The first operation is the extension one.
It represents the time elapsing in the source node. It is expressed as
the existence of a $\delta$-predecessor. In other words, one can say: "Both
the node-invariant and the entering condition polyhedron were satisfied
$\delta$ time units ago, and the node-invariant is true now". This is equivalent
to the following formula:

$\Pi_1 \Leftrightarrow \exists\delta.((\delta \geq 0) \wedge Inv_s(x - v_s \cdot \delta) \wedge \Pi_0(k, x - v_s \cdot \delta)) \wedge Inv_s(x)$

With $\Pi_1$ the set of inequalities which stands for time elapsing in the
'filling' node:

$\Pi_1 \Leftrightarrow (\exists\delta.(\delta \geq 0 \wedge x_1 - \delta = 1 \wedge x_2 - k.\delta = 1)) \wedge (x_1 \leq 3)$ (3)

$\Pi_1 \Leftrightarrow (1 \leq x_1 \leq 3) \wedge (x_2 = k.(x_1 - 1) + 1)$ (4)
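As a hedged illustration of formulas (3) and (4), the following sketch checks membership in $\Pi_1$ for concrete rational values, once with the existential form and once with the quantifier-free form. The function names are ours, not the paper's, and the closed form assumes the bounds $1 \leq x_1 \leq 3$ that follow from $\delta \geq 0$ and the invariant $x_1 \leq 3$.

```python
from fractions import Fraction


def in_pi1(k, x1, x2):
    """Existential form (3): is there delta >= 0 with x1 - delta = 1,
    x2 - k*delta = 1, and the 'filling' invariant x1 <= 3?"""
    delta = x1 - 1  # the only candidate, forced by x1 - delta = 1
    return delta >= 0 and x2 - k * delta == 1 and x1 <= 3


def in_pi1_closed_form(k, x1, x2):
    """Quantifier-free form (4), obtained by eliminating delta."""
    return 1 <= x1 <= 3 and x2 == k * (x1 - 1) + 1
```

For example, with k = 1/2 the point (x1, x2) = (3, 2) belongs to $\Pi_1$ under both forms, while any point with x1 > 3 is rejected by the invariant.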

The Restriction Operation. The elimination of all linear variables is
called restriction. This method is used twice in the analysis.

First, when crossing an edge, the guard of the discrete transition has
to be verified. It is solved by finding the conditions on the parameters
that enable it. With $C_2$ being the conditions to verify the guard:

$C_2 \Leftrightarrow (\exists x.\ \Pi_1(k, x) \wedge Guard_{s,d}(x))$

Thus, the condition $C_2$ and the polyhedron $\Pi_2$ ($\Pi_1 \wedge Guard_{s,d}(x)$)
give the new definition of the state at this stage. For instance, when the
clock reaches 3 time-units:

$\exists x_1. \exists x_2. (x_1 = 3 \wedge \Pi_1(k, x)) \Leftrightarrow true$

Secondly^1, when entering the destination node, its invariant has to
be verified. Assume that after the reset of the needed clocks, the current
polyhedron is $\Pi_3$. Then one must find the parameter values that allow
entering the destination node. They are given by the following quantifier
elimination:

$C_4 \Leftrightarrow \exists x_1 .. \exists x_n.(\Pi_3(k, x) \wedge Inv_d(x))$

In the example:

$C_4 \Leftrightarrow (\exists x_1. \exists x_2.(x_1 = 0 \wedge x_2 = 2k + 1 \wedge x_2 \geq 0)) \Leftrightarrow (k \geq -\frac{1}{2})$

^1 Chronologically, the variables are reset before going into the node, which leads to the
polyhedron $\Pi_3$.







The Projection Operation. The projection operation is used to reset
clocks $x_{i_1} ... x_{i_p}$ when an edge is crossed. Resetting clocks means, first, making sure
that the system has solutions in the clocks to reset and, secondly, setting
those variables to zero:

$\Pi_3 \Leftrightarrow (\exists x_{i_1} .. \exists x_{i_p}.\ \Pi_2(k, x)) \wedge \bigwedge_{l=1}^{p} (x_{i_l} = 0)$

Before entering the 'emptying' node:

$\Pi_3 \Leftrightarrow (\exists x_1.\ \Pi_2(k, x)) \wedge (x_1 = 0) \Leftrightarrow (x_1 = 0 \wedge x_2 = 2k + 1)$
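The three operations can be replayed on the water-tank example for one concrete value of k. The sketch below is only an instantiation for fixed rational k, not the symbolic quantifier elimination of the paper; the function name `cross_filling_to_emptying` is hypothetical.

```python
from fractions import Fraction


def cross_filling_to_emptying(k):
    """Return the state (x1, x2) on entering 'emptying' for slope k,
    or None when the destination invariant x2 >= 0 cannot be met."""
    # Extension from x1 = x2 = 1, then guard x1 = 3: delta = 2 time units.
    delta = Fraction(2)
    x1, x2 = 1 + delta, 1 + k * delta
    # Projection: the edge resets x1 to zero.
    x1 = Fraction(0)
    # Restriction: the invariant of 'emptying' must hold on entry.
    return (x1, x2) if x2 >= 0 else None
```

The function returns a state exactly when $k \geq -\frac{1}{2}$, which is the condition $C_4$ obtained above, and the state it returns is $(0, 2k + 1)$.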

3.2 The Fix-Point Computation 

We have shown that the "Edge Crossing" parametric analysis consists of quantifier
eliminations for parametric linear formulas. In this part, we depict the
global algorithm.

It is a fix-point computation. At each step n, a new region $R_n$ (a
region is a set of states) is computed using the "Edge Crossing" analysis to
find all the states reachable after no more than n edge crossings. It
completes when an integer n such that $R_{n+1} = R_n$ is found. One can note
that the reachability for S.P.L.H.A. is undecidable [Bur98], so nothing
guarantees the termination of the algorithm.
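The iteration just described can be sketched as a bounded loop. This is a minimal illustration over explicit finite sets of states, assuming a user-supplied `crossing` function; the paper works on symbolic regions instead, and the names `reach` and `max_steps` are ours. The step bound reflects the fact that only a semi-algorithm is available.

```python
def reach(start, crossing, max_steps):
    """Iterate R_{n+1} = R_n | crossing(R_n) from R_0 = start.

    Returns (region, converged): converged is True when a fix-point
    R_{n+1} = R_n was found within max_steps iterations.
    """
    region = frozenset(start)
    for _ in range(max_steps):
        new_region = region | frozenset(crossing(region))
        if new_region == region:
            return region, True
        region = new_region
    return region, False
```

On a finite graph the loop stops as soon as no new state appears; on a system whose regions keep growing, it gives up after `max_steps` iterations and reports that no fix-point was found.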

Consequently, semi-algorithms have been implemented to realize this
function which computes the reachable region $R_\infty$. Up to now, the test
of equality of regions and the solving of systems over parameters (more
precisely, finding the existential conditions on the parameters) had been
implemented using an approximate analysis [Bur98]. [Ade99] presents
an approach to solve the equality of regions. This approach can be used
to find the existential conditions on parameters and to simplify the
polyhedra. In the next section, the two algorithms designed for this purpose
will be described.

4 The Simplification Algorithm and the Inclusion Test 

When entering a visited node, one must check whether the new states have
already been found by a former analysis. This is the purpose of the
inclusion test. There is also another goal to achieve: shortening
the writing of the polyhedra, because the QE methods increase the size of
the formulas (in terms of inequalities) when they eliminate variables. We
are going to show the simplification algorithm first. As a matter of fact,
it gives an easy way to answer the inclusion problem.







4.1 The Simplification Algorithm 

Given a parametric polyhedron, the simplification algorithm follows two
goals:

— Find the condition on the parameters such that the polyhedron is not
empty

— For each found condition, shorten the writing of the polyhedron.

The main idea is to answer the two questions by solving non-parametric
inequalities. It means that we must find a partition of $E_k$, the
space of the parameters, and work on each cell of the partition taking any
representative of the cell. To this end, the expected properties must be
invariant on each cell.



Introduction of the Distances. We first turn the original problem
upon clocks and parameters into a new one introducing the "distances"^2
to the hyperplanes. We write:

$(f_j(k, x) \succeq 0) \Leftrightarrow \exists p_j.(h_j(k, p, x) = 0 \wedge p_j \succeq 0)$

with $h_j(k, p, x) = f_j(k, x) - p_j$ and p is the vector $(p_1, ..., p_m)$.

Therefore, using all polynomials $f_j$, $F(k, x) \succeq 0$ is equivalent to
$\exists p.(H(k, p, x) = 0 \wedge p \succeq 0)$. To obtain a new system, equivalent to the
first one, in the unknowns p and k, we compute $Gb(k, p, x)$, the Gröbner
basis of $H(k, p, x)$ with a lexicographic order that satisfies the following
property: any parameter is lower than any distance, and any distance is
lower than any variable, which is noted $K \prec P \prec X$ (P being the list of the $p_j$).

Gröbner bases were invented by Bruno Buchberger in 1965 [Buc].
A Gröbner basis [DCO92,BW93,Coh96] of a set of polynomials is another
representation of the original one. Both systems have the same zeroes.
Gröbner bases generalise the Gaussian transformation of a linear system
into an upper triangular form (when lexicographic
orderings are used). A good introduction to Gröbner bases is given in
[Coh96].

Using the extension theorem [DCO92] and the linearity of the original
system $H(k, p, x) = 0$, the polynomials in k and p are extracted from
$Gb(k, p, x)$. They make a system $Gb_1(k, p)$. And the equivalence^3:

$(\exists x \in \mathbb{R}^n.(F(k, x) \succeq 0)) \Leftrightarrow (\exists p \in \mathbb{R}^m.(Gb_1(k, p) = 0 \wedge p \succeq 0))$

turns the original problem upon parameters and variables into a new one
upon parameters and distances.

^2 The term "distance" is quoted because what matters is not the exact value of the
Euclidean distance but whether it is null or not.

For instance, when entering the 'emptying' node for the second time,
the variables values are given by:

$x_2 = k.(3 - \frac{1 + 2k}{2}) \wedge x_1 = 3 \wedge x_2 \geq 0$

Let $h_1 = x_2 - k.(3 - \frac{1 + 2k}{2}) - p_1$, $h_2 = x_1 - 3 - p_2$, $h_3 = x_2 - p_3$. Then,

$Gb(k, p, x) = [-5k + 2k^2 - 2p_1 + 2p_3,\ -3 + x_1 - p_2,\ 2x_2 - 5k + 2k^2 - 2p_1]$

Only the first polynomial of Gb has no coefficient in x, and so it is the
only one to be considered in the rest of the analysis (according to the
extension theorem).
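The first polynomial of the basis can be checked by hand. The short derivation below is our own verification, not taken from the paper; it only expands $h_1$ and eliminates $x_2$ between $h_1$ and $h_3$.

```latex
\begin{align*}
k\left(3 - \tfrac{1 + 2k}{2}\right) &= 3k - \tfrac{k}{2} - k^2 = \tfrac{5}{2}k - k^2,\\
h_1 &= x_2 - \tfrac{5}{2}k + k^2 - p_1,\\
h_1 - h_3 &= -\tfrac{5}{2}k + k^2 - p_1 + p_3,\\
2\,(h_1 - h_3) &= -5k + 2k^2 - 2p_1 + 2p_3 .
\end{align*}
```

The last line is free of $x_1$ and $x_2$ and lies in the ideal generated by $h_1, h_2, h_3$, which is consistent with it being the element of the basis kept for the analysis upon parameters and distances.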

Partition of the Space of the Parameters using the CAD Algorithm.
In the previous part, the problem upon variables and parameters
has been turned into one upon “distances” and parameters:

$Gb_1(k, p) = 0 \wedge p \succeq 0$

(For the example, we consider: $-5k + 2k^2 - 2p_1 + 2p_3 = 0 \wedge p_1 = p_2 = 0 \wedge p_3 \geq 0$).
In order to obtain both the conditions on the parameters and
the useless inequalities, a partition of the real space of parameters, $E_k$^4,
is computed using the Cylindrical Algebraic Decomposition (C.A.D.).

The C.A.D. was introduced by Collins in 1975 [Col75]. Given a
set of multivariate polynomials in $\mathbb{R}[x_1, ..., x_n]$, it computes a partition
of $\mathbb{R}^n$ such that the polynomials have a constant sign (strictly negative,
null, or strictly positive) over each cell [Jir95], [Rod96]. The built partition
has a tree-structure, each level of the tree corresponding to the partition
of $\mathbb{R}$ for one variable. Taking into account an ordering on the variables
($x_1 < x_2 < ... < x_n$ for example), the algorithm proceeds in 3 steps. The
first one is called the Projection-Phase. It consists in finding recursively
systems of polynomials which make it possible to determine a partition of $\mathbb{R}^i$ given
polynomials of $\mathbb{R}[x_1, .., x_{i+1}]$. The second one, the Base-Phase, builds
the first partition (partition of $\mathbb{R}$). The last one, called Extension-Phase,
builds a partition of $\mathbb{R}^{i+1}$ given a partition of $\mathbb{R}^i$. The complete algorithm
is given in [Jir95]. We only explain the main ideas of the C.A.D. algorithm
by analysing our example.

^3 The extension theorem requires working in algebraically closed fields. Our study belongs
to the theory of ordered real fields. Using the extension theorem, one obtains a complex vector
the real parts of whose coordinates verify the original system $H(k, p, x) = 0$ (this
system is linear upon x and p).

^4 In order to deal with the values of the parameters $\{k_1, ..., k_q\}$, a vector $(k_1, ..., k_q)$
of parameters is used. It belongs to $\mathbb{R}^q$. $E_k$ stands for $\mathbb{R}^q$ to make easier the
distinction between the parameters-space, the variables-space and the distances-space.

We have to consider the polynomial $g = -5k + 2k^2 - 2p_1 + 2p_3$, for
the ordered unknowns $k < p_1 < p_2 < p_3$.

First, we "eliminate" $p_3$^5. We make the following analysis. The sign
of g seen as a polynomial in $p_3$ depends on the polynomials
$proj_1 = \{-5k + 2k^2 - 2p_1, 2\}$. Thus, suppose we have a partition of $\mathbb{R}^3$ (unknowns
$[k, p_1, p_2]$) for the polynomials of $proj_1$. The cells of the partition of
$\mathbb{R}^4$ are obtained from each cell $C_{3,j}$, as follows^6:

- $C_{4,3j} \Leftrightarrow (C_{3,j} \wedge (2p_3 < -(-5k + 2k^2 - 2p_1)))$

- $C_{4,3j+1} \Leftrightarrow (C_{3,j} \wedge (2p_3 = -(-5k + 2k^2 - 2p_1)))$

- $C_{4,3j+2} \Leftrightarrow (C_{3,j} \wedge (2p_3 > -(-5k + 2k^2 - 2p_1)))$

Fig. 2. Plane $(k, p_1)$

Finally, having eliminated $p_2$ and $p_1$, and considering $proj_3 = \{-5k + 2k^2, 2\}$,
the C.A.D. computes the roots of the first polynomial^7 and then
the following partition of $\mathbb{R}$:

$]-\infty, 0[\ \cup\ \{0\}\ \cup\ ]0, \frac{5}{2}[\ \cup\ \{\frac{5}{2}\}\ \cup\ ]\frac{5}{2}, +\infty[$

^5 The meaning of "eliminate" is different here. We are looking for a CAD of $\mathbb{R}^3$ for the
variables $[k, p_1, p_2]$ which can be extended to $\mathbb{R}^4$ (taking into account $p_3$).

^6 Such operations are done in the Extension Phase, which is the third step of the
C.A.D. computation.
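The base phase for the example can be illustrated in a few lines. The sketch below assumes the roots of $-5k + 2k^2$ are already known exactly (0 and 5/2); it splits the real line at those roots, picks one sample point per cell, and records the sign of the polynomial on each cell. The helper names are ours, and this is not a general C.A.D. implementation.

```python
from fractions import Fraction


def cells_of_line(roots):
    """Split the real line at the given (non-empty) list of roots.
    Returns a list of (label, sample point), ordered from left to right."""
    roots = sorted(roots)
    cells = [(f"]-inf, {roots[0]}[", roots[0] - 1)]
    for i, r in enumerate(roots):
        cells.append((f"{{{r}}}", r))  # the root itself is a cell
        if i + 1 < len(roots):
            nxt = roots[i + 1]
            cells.append((f"]{r}, {nxt}[", (r + nxt) / 2))
    cells.append((f"]{roots[-1]}, +inf[", roots[-1] + 1))
    return cells


def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def poly(k):
    """First polynomial of proj_3 in the example."""
    return -5 * k + 2 * k * k


CELLS = cells_of_line([Fraction(0), Fraction(5, 2)])
SIGNS = [sign(poly(sample)) for _, sample in CELLS]
```

Since the sign is constant on each cell, evaluating the polynomial at the sample point is enough to know its sign on the whole cell.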







Return to linear systems. On each cell of the partition of $E_k$, the system
$(Gb_1(k, p) = 0 \wedge p \succeq 0)$ has to be solved, where k is any representative
of the cell. [Ade99] shows that this system in P is equivalent to a linear
one given by adding the sign conditions $p \succeq 0$ to the Gröbner basis
computation of $Gb_1(k, p)$, for any lexicographic order on P^8. Therefore,
this system is built and solved.

In our example, the polynomials are still linear in the variables. Then,
no more computation has to be done. One must solve
$-5k + 2k^2 - 2p_1 + 2p_3 = 0 \wedge p_1 = p_2 = 0 \wedge p_3 \geq 0$, replacing k by a representative of each
cell.
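For the example, solving on each cell amounts to one substitution per representative. The following sketch assumes the five cells of the partition of $\mathbb{R}$ and picks one rational sample per cell; the labels, the sample values, and the name `solve_cell` are our own choices.

```python
from fractions import Fraction

# One representative per cell of ]-inf,0[, {0}, ]0,5/2[, {5/2}, ]5/2,+inf[
SAMPLES = {
    "]-inf, 0[": Fraction(-1),
    "{0}": Fraction(0),
    "]0, 5/2[": Fraction(5, 4),
    "{5/2}": Fraction(5, 2),
    "]5/2, +inf[": Fraction(7, 2),
}


def solve_cell(k):
    """Solve -5k + 2k^2 - 2*p1 + 2*p3 = 0 with p1 = p2 = 0 for fixed k.
    Returns (p1, p2, p3) when the sign condition p3 >= 0 holds, else None."""
    p1 = p2 = Fraction(0)
    p3 = (5 * k - 2 * k * k + 2 * p1) / 2
    return (p1, p2, p3) if p3 >= 0 else None


FEASIBLE = [label for label, k in SAMPLES.items() if solve_cell(k) is not None]
```

The feasible cells are {0}, ]0, 5/2[ and {5/2}, so under these assumptions the polyhedron is non-empty exactly for $0 \leq k \leq \frac{5}{2}$.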



Simplification in the non-parametric case. Then, non-parametric
linear systems have to be solved. We only give the main idea to eliminate
useless inequalities for the shortening problem; the existence of a solution
is decided by eliminating all variables.

The argument to eliminate useless inequalities is to check whether the border
of the polyhedron intersects the hyperplane $f_j = 0$ (cf. [Ade99]; proofs
can be obtained from the authors). If it does not, the inequality can be
removed (in Figure 3, the line 5 is useless).




^7 We assume here that the exact values of the polynomial roots can be computed,
which is not always possible. However, there are also techniques based on the Sign
Determination Scheme [GVRRT95] to solve the problem we are dealing with when
the exact values cannot be computed.

^8 In an earlier footnote, we noticed that the Gröbner basis computation of a linear
parametric system does not lead to a linear parametric system. What is claimed here
is that giving values to the parameters and recalculating a Gröbner basis leads to a linear
system.






For the example, there are solutions only for $0 \leq k \leq \frac{5}{2}$ and no
relations have to be eliminated.
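The criterion for useless inequalities can be tried on a small non-parametric case. The sketch below is a stand-in of our own, restricted to bounded, non-empty polygons in the plane: it enumerates the vertices of the polygon and reports the constraints whose line $f_j = 0$ contains no vertex, which for such polygons means that the line does not touch the border. It is not the procedure of [Ade99], and all names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def vertices(constraints):
    """Vertices of the polygon {(x, y) : a*x + b*y + c >= 0 for all (a, b, c)}."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines have no single intersection point
        x = Fraction(b1 * c2 - b2 * c1, det)
        y = Fraction(a2 * c1 - a1 * c2, det)
        if all(a * x + b * y + c >= 0 for a, b, c in constraints):
            points.add((x, y))
    return points


def useless(constraints):
    """Indices of constraints whose line touches no vertex of the polygon."""
    found = vertices(constraints)
    return [j for j, (a, b, c) in enumerate(constraints)
            if not any(a * x + b * y + c == 0 for x, y in found)]
```

For the unit square given by $x \geq 0$, $y \geq 0$, $1 - x \geq 0$, $1 - y \geq 0$ together with the extra constraint $5 - x \geq 0$, only the fifth constraint is reported, since its line stays away from the square.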

4.2 Inclusion Algorithm 

In this section, the problem $A \subseteq B$ where A and B are both parametric
polyhedra is tackled using the previous simplification algorithm, and the
following equivalence, for a given polyhedron $\Pi$:

$simplification(\Pi) = \emptyset \Leftrightarrow (\Pi = \emptyset)$

It means that the inclusion test has to be transformed into emptiness
tests. It is done according to the equivalence:

$(A \subseteq B) \Leftrightarrow (A \cap \overline{B} = \emptyset)$

Then, writing $B = \bigcap_i (g_i \succeq 0)$, one must check:

$\forall i,\ simplification(A \cap (g_i \not\succeq 0)) = \emptyset$

to obtain the inclusion algorithm.
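The reduction of inclusion to emptiness tests can be illustrated in one variable, without parameters. In the sketch below a constraint is a triple (a, c, strict) standing for $a x + c > 0$ when strict and $a x + c \geq 0$ otherwise; B is assumed to contain only non-strict constraints, and the emptiness test is a direct bound computation rather than the simplification algorithm of the paper. All names are ours.

```python
from fractions import Fraction


def is_empty(constraints):
    """Emptiness of a system of one-variable linear constraints."""
    lo, lo_strict = None, False  # strongest lower bound found so far
    hi, hi_strict = None, False  # strongest upper bound found so far
    for a, c, strict in constraints:
        if a == 0:
            if c < 0 or (strict and c == 0):
                return True  # constant constraint that never holds
            continue
        bound = Fraction(-c, a)
        if a > 0:  # x >= bound (or x > bound)
            if lo is None or bound > lo or (bound == lo and strict):
                lo, lo_strict = bound, strict
        else:      # x <= bound (or x < bound)
            if hi is None or bound < hi or (bound == hi and strict):
                hi, hi_strict = bound, strict
    if lo is None or hi is None:
        return False
    return lo > hi or (lo == hi and (lo_strict or hi_strict))


def included(A, B):
    """A is included in B iff A and (g_i < 0) is empty for every g_i >= 0 of B."""
    return all(is_empty(A + [(-a, -c, True)]) for a, c, _ in B)
```

With A = [1, 2] and B = [0, 3] written as constraint lists, `included(A, B)` holds and `included(B, A)` does not.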

5 Results 

In our example, the algorithm never completes because, for most of the
values of the parameter, the behaviour is asymptotic. Nevertheless, if we
look at the values of k that allow entering the 'filling' node, they are in an
infinite sequence of intervals $[0, M_n]$, where $\lim_{n \to \infty} M_n = 2$, which can
be proved mathematically. In conclusion, even if the semi-algorithm does
not terminate, it gives us indications on the behaviour of the water-tank
level control. These indications can then be used to complete the proof.
Of course, in many examples, the algorithm completes, but such examples
are less suitable for explanation and illustration.
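The asymptotic behaviour can be observed numerically for fixed values of k. The simulation below rests on assumptions of ours that the text only suggests: the 'emptying' node is left when the level reaches 0, the clock is not reset on that edge, and a round fails when an invariant ($x_2 \geq 0$ in 'emptying', $x_1 \leq 3$ in 'filling') is violated on entry. The name `rounds_completed` is hypothetical.

```python
from fractions import Fraction


def rounds_completed(k, max_rounds):
    """Count the emptying-then-filling rounds that can be completed for slope k."""
    level = 1 + 2 * k  # level x2 on the first entry into 'emptying'
    for n in range(max_rounds):
        if level < 0:
            return n  # invariant x2 >= 0 of 'emptying' is violated
        emptying_time = level / 2  # slope -2 until the level reaches 0
        if emptying_time > 3:
            return n  # invariant x1 <= 3 of 'filling' is violated on entry
        level = k * (3 - emptying_time)  # filling until the clock reaches 3
    return max_rounds
```

Under these assumptions, k = 1 survives any number of rounds, k = 3 fails immediately, and k = 11/5 completes a few rounds before failing, which is in line with admissible intervals that shrink towards [0, 2].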

6 Conclusion 

We have shown a method to find out the set of fair dynamics of a hybrid
system, i.e. the evolution rates that make the behaviour of the system
correct. It consists in using Slope Parametric Linear Hybrid Automata
(S.P.L.H.A.), which are hybrid automata the variable speeds of which are
parametric functions. For those systems, we have outlined the analysis,
which is an analysis on parametric polyhedra, and we have depicted two
main algorithms. The simplification algorithm finds the conditions on
parameters that guarantee non-emptiness of parametric polyhedra and it
shortens the description of these polyhedra. It uses Gröbner bases and
the Cylindrical Algebraic Decomposition, which are two significant topics
in algebraic geometry. Moreover, we have written an algorithm, based on
the previous simplification procedure, in order to decide the inclusion of
parametric polyhedra.

The next important task we are now involved in is to improve
the quantifier elimination algorithm for the particular systems we deal
with. We expect that such improvements will give results useful to handle
such parametric analyses for large systems. We also plan
to address the problem of dealing with several parameters, as we are up to
now limited to only one parameter. As a case study, we are working towards
synthesizing the maximal drift in the Philips communication
protocol.



Acknowledgments 

The authors wish to thank Augusto Burgueño, who first had the idea of
using Gröbner bases for our problem, and Frederic Boniol and Vlad Rusu,
who contributed to this parametric analysis work at the beginning.



References 



[ACD90] R. Alur, C. Courcoubetis, and D. Dill. Model-checking for real-time
systems. In Proc. of the 5th Annual Symposium on Logic in Computer
Science, LICS’90, pages 414-425. IEEE Computer Society Press, 1990.

[ACH+95] R. Alur, C. Courcoubetis, N. Halbwachs, T. A. Henzinger, P-H. Ho,
X. Nicollin, A. Olivero, J. Sifakis, and S. Yovine. The algorithmic
analysis of hybrid systems. Theoretical Computer Science, 138:3-34,
1995.

[AD94] R. Alur and D. Dill. A theory of timed automata. Theoretical Computer
Science, 126:183-235, 1994.

[Ade99] M. Adelaide. Application des bases de Gröbner à l’analyse paramétrique
des systèmes hybrides. Master’s thesis, Ecole Centrale de Nantes,
August 1999.

[AH94] R. Alur and T. A. Henzinger. Real-time system = discrete system
+ clock variables. In T. Rus and C. Rattray, editors, Theories and
Experiences for Real-Time System Development - Papers presented at
First AMAST Workshop on Real-Time System Development, pages 1-29,
Iowa City, Iowa, 1994. World Scientific Publishing. Also available
as Cornell University technical report CSD-TR-94-1403.

[AHV93] R. Alur, T. A. Henzinger, and M. Y. Vardi. Parametric real-time
reasoning. In Proc. of the 25th Annual ACM Symposium on Theory of
Computing, STOC’93, pages 592-601, 1993.

[AMP95] E. Asarin, O. Maler, and A. Pnueli. Reachability analysis of dynamical
systems having piecewise-constant derivatives. Theoretical Computer
Science, 138:35-65, 1995.

[Bur98] A. Burgueño Arjona. Vérification des Systèmes Temporisés par des
Méthodes d’Observation et d’Analyse Paramétrique. PhD thesis, Ecole
Nationale Supérieure de l’Aéronautique et de l’Espace, June 1998.

[BBRR97] F. Boniol, A. Burgueño, O. Roux, and V. Rusu. Analysis of
slope-parametric hybrid automata. In O. Maler, editor, Proc. of the
International Workshop on Real time and Hybrid Systems, HART 97, pages
75-80. Springer-Verlag, March 26-28 1997.

[BR97] A. Burgueño and V. Rusu. Task-system analysis using slope-parametric
hybrid automata. In Proc. of the Euro-Par’97 Workshop on Real-Time
Systems and Constraints, Passau, Germany, August 26-29 1997.
Springer-Verlag’s Lecture Notes in Computer Science series n°1300.

[Buc] B. Buchberger. Gröbner Bases: an Algorithmic Method in Polynomial
Ideal Theory, pages 184-232. Reidel.

[BW93] T. Becker and V. Weispfenning. Gröbner Bases: A Computational
Approach to Commutative Algebra. Springer Verlag, 1993.

[Coh96] A.M. Cohen. Gröbner base: a primer. Technical report, Computer
Algebra Information Network, Europe, 1996.

[Col75] George E. Collins. Quantifier Elimination for Real Closed Fields by
Cylindrical Algebraic Decomposition. In Proceedings of the 2nd GI
Conference, volume 33 of Lecture Notes in Computer Science, pages
134-183, Kaiserslautern, 1975. Springer, Berlin.

[DCO92] D. Cox, J. Little, and D. O’Shea. Ideals, Varieties and Algorithms.
Springer Verlag, 1992.

[GVRRT95] Laureano Gonzalez-Vega, Fabrice Rouillier, Marie-Françoise Roy, and
Guadalupe Trujillo. Some Tapas of Computer Algebra. Eindhoven
University of Technology, A.M. Cohen, H. Cuypers, and H. Sterk edition,
1995.

[Hen96] T. Henzinger. The theory of hybrid automata. In IEEE Symposium on
Logic In Computer Science, pages 278-282, 1996.

[HH95] T.A. Henzinger and P.-H. Ho. HyTech: The Cornell Hybrid Technology
Tool. In P. Antsaklis, A. Nerode, W. Kohn, and S. Sastry, editors,
Hybrid Systems II, Lecture Notes in Computer Science 999, pages 265-293.
Springer-Verlag, 1995.

[MMP92] O. Maler, Z. Manna, and A. Pnueli. From timed to hybrid systems.
In J. W. de Bakker, K. Huizing, W.-P. de Roever, and G. Rozenberg,
editors, Proc. of the REX workshop ’Real-Time: theory in practice’,
volume 600 of Lecture Notes in Computer Science, pages 447-484, Berlin,
New York, 1992. Springer-Verlag.

[Jir95] M. Jirstrand. Cylindrical algebraic decomposition - an introduction.
Technical report, Computer Algebra Information Network, Europe,
Department of Electrical Engineering, Linköping University, S-581 83
Linköping, Sweden, October 1995.

[Rod96] G. Roda. Quantifier elimination - lecture notes based on a course by G.E.
Collins, RISC, summer semester 96. Technical report, Computer Algebra
Information Network, Europe, July 1996.

[Wan96] Farn Wang. Parametric timing analysis for real-time systems.
130(2):131-150, 1 November 1996.




Probabilistic Neighbourhood Logic 



Dimitar P. Guelev 



International Institute for Software Technology 
of the United Nations University 
(UNU/IIST), Macau, P.O.Box 3058. 
E-mail: dg@iist.unu.edu 



Abstract. This paper presents a probabilistic extension of Neighbourhood
Logic (NL, [14,1]). The study of such an extension is motivated by
the need to supply the Probabilistic Duration Calculus (PDC, [10,4])
with a proof system. The relation between the new logic and PDC is
similar to that between DC [15] and ITL [12,3]. We present a complete
proof system for the new logic.



Introduction 

The Probabilistic Duration Calculus (PDC) was introduced in [10] as an
extension of Duration Calculus [15]. The approach to introducing PDC is as follows:
Consider some finite probabilistic timed automaton A. The behaviours of A can
be represented as a set M of DC models. The probabilistic laws that govern the
working of A are used to introduce probability on the subsets of M. Given a
DC formula D, the term $\mu(D)(t)$ denotes the probability of those models from
M that satisfy D at the interval [0,t]. Terms of this sort are the component of
the PDC language that is new relative to DC. In [10] the authors focused
on the case of discrete time for the sake of simplicity. In a later work, [4], PDC
was introduced for the case of real time too.

Both papers present examples of specification by PDC and a number of valid
PDC formulas that represent basic properties of the probabilistic operator $\mu$. A
section on specification by PDC can be found in [11] too. However, no complete
proof system for PDC has been proposed so far.

DC is an extension of Interval Temporal Logic (ITL), and so is its proof
system. ITL has a complete proof system with respect to an abstract class of
frames [3]. In this paper, we introduce Probabilistic Neighbourhood Logic (PNL)
by generalising the semantics of PDC. PNL is designed to play for PDC the role
that ITL plays for DC. PNL is based on Neighbourhood Logic (NL,
[14]), which is another interval-based temporal logic, closely related to ITL.
Unlike ITL, NL has modal operators which allow reference to intervals outside
the current one. This feature has proved useful for the axiomatisation of the
probabilistic operator of PNL. NL has a proof system that is complete with
respect to an abstract semantics, which is similar to that of ITL [1].

In this paper we extend the proof system of NL to obtain a complete one
for PNL for a similarly abstract semantics. Earlier versions of PNL have been
studied by the author in [6,7] and by Vladimir Trifonov in [13].

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 264-275, 2000. 

© Springer- Verlag Berlin Heidelberg 2000 







1 Preliminaries on Neighbourhood Logic 

Neighbourhood logic is a classical first order predicate logic with equality and 
two unary normal modal operators. 

1.1 Language 

A language of NL is determined by a countable set of individual variables x,
y, . . . , and several other sets of symbols. These are constant symbols c, d, . . . ,
function symbols f, g, . . . and relation symbols R, S, . . . . Symbols of every kind
can be either rigid or flexible, depending on the way they are interpreted.

Given the sets of symbols, the terms t and the formulas $\varphi$ of the corresponding
NL languages are defined by the BNFs:

$t ::= c \mid x \mid f(t, \ldots, t)$

$\varphi ::= \bot \mid R(t, \ldots, t) \mid (\varphi \Rightarrow \psi) \mid \exists x \varphi \mid \Diamond_l \varphi \mid \Diamond_r \varphi$

Function symbols and relation symbols are assigned arity to denote the
number of arguments they admit. Every NL language contains the rigid constant
symbol 0, the rigid binary function symbol +, the rigid binary relation symbols
= and $\leq$ and the flexible constant $\ell$.

Individual variables are regarded as rigid. Formulas and terms which contain
no flexible symbols are called rigid too. The set of individual variables that have
free occurrences in a formula $\varphi$ is denoted by $FV(\varphi)$.

1.2 Frames, Models and Satisfaction 

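Definition 1. An NL time domain is a linearly ordered set. An NL duration 
domain is an algebraic system of the type $\langle D, +, 0, \le\rangle$ which satisfies the 
axioms: 

(D1) $x + (y + z) = (x + y) + z$ 
(D2) $x + 0 = x$ 
(D3) $x + y = x + z \Rightarrow y = z$ 
(D4) $\exists z(x + z = y)$ 
(D5) $x + y = y + x$ 
(D6) $x \le x$ 
(D7) $x \le y \wedge y \le x \Rightarrow x = y$ 
(D8) $x \le y \wedge y \le z \Rightarrow x \le z$ 
(D9) $x \le y \Leftrightarrow \exists z(x + z = y \wedge 0 \le z)$ 

We use $\le$ to denote the orderings of both time and duration domains. 

Definition 2. Given a time domain $\langle T, \le\rangle$, the set of the closed intervals 
$\{[\tau_1, \tau_2] : \tau_1, \tau_2 \in T, \tau_1 \le \tau_2\}$ in $T$ is denoted by $\mathbf{I}(T)$. Given a time domain 
$\langle T, \le\rangle$ and a duration domain $\langle D, +, 0, \le\rangle$, a measure function $m$ is a surjective 
function of type $\mathbf{I}(T) \to D$ which satisfies the axioms: 

(M1) $m(\sigma) = m(\sigma') \wedge \min\sigma = \min\sigma' \Rightarrow \max\sigma = \max\sigma'$ 
(M2) $\max\sigma_1 = \min\sigma_2 \Rightarrow m(\sigma_1) + m(\sigma_2) = m(\sigma_1 \cup \sigma_2)$ 
(M3) $m(\sigma) = x + y \Rightarrow \exists\sigma'(\min\sigma' = \min\sigma \wedge m(\sigma') = x)$ 

Definition 3. A tuple of the kind $\langle\langle T, \le\rangle, \langle D, +, 0, \le\rangle, m\rangle$, where $\langle T, \le\rangle$ is a 
time domain, $\langle D, +, 0, \le\rangle$ is a duration domain, and $m$ is a measure function from 
$\mathbf{I}(T)$ to $D$, is called an NL frame. 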







Dimitar P. Guelev 



Clearly, if a measure function from a time domain $\langle T, \le\rangle$ to a duration 
domain $\langle D, +, 0, \le\rangle$ exists, $\langle D, \le\rangle$ is isomorphic to $\langle T, \le\rangle$. For this reason NL is 
usually regarded as having just duration domains in its frames. We keep the 
two components of NL frames distinct for the sake of compatibility with ITL 
semantics, where they may differ more. 

Let L be an NL language. 

Definition 4. Let $F = \langle\langle T, \le\rangle, \langle D, +, 0, \le\rangle, m\rangle$ be an NL frame. 
A function $I$ which is defined on the set of symbols of L and satisfies the 
requirements: 

$I(x), I(c) \in D$ for individual variables $x$ and rigid constants $c$ 
$I(f) \in (D^n \to D)$ for $n$-place rigid function symbols $f$ 
$I(R) \in (D^n \to \{0, 1\})$ for $n$-place rigid relation symbols $R$ 
$I(c) \in (\mathbf{I}(T) \to D)$ for flexible constants $c$ 
$I(f) \in (\mathbf{I}(T) \times D^n \to D)$ for $n$-place flexible function symbols $f$ 
$I(R) \in (\mathbf{I}(T) \times D^n \to \{0, 1\})$ for $n$-place flexible relation symbols $R$ 
$I(0) = 0$, $I(+) = +$, $I(\ell) = m$, $I(\le)$ is $\le$ and $I(=)$ is $=$ 

is called an interpretation of L into $F$. 

Definition 5. A model for L is a tuple of the kind $\langle F, I\rangle$, where $F$ is a frame, 
and $I$ is an interpretation of L into $F$. 

Given a frame $F$, we denote its components by $\langle T_F, \le_F\rangle$, $\langle D_F, +_F, 0_F, \le_F\rangle$ 
and $m_F$, respectively. The same applies to models. We denote the frame and the 
interpretation of a model $M$ by $F_M$ and $I_M$, respectively. 

Given a symbol $s$ from L, interpretations $I$ and $J$ of L into frame $F$ are said 
to $s$-agree if $I(s') = J(s')$ for all symbols $s'$ of L other than $s$. 

Definition 6. Let $M$ be a model for L. Let $\sigma \in \mathbf{I}(T_M)$. The values $I_\sigma(t)$ of 
terms $t$ from L are defined as follows: 

$I_\sigma(x) = I_M(x)$, $I_\sigma(c) = I_M(c)$ for variables $x$ and rigid constants $c$ 
$I_\sigma(f(t_1, \ldots, t_n)) = I_M(f)(I_\sigma(t_1), \ldots, I_\sigma(t_n))$ for rigid $n$-place function symbols $f$ 
$I_\sigma(f(t_1, \ldots, t_n)) = I_M(f)(\sigma, I_\sigma(t_1), \ldots, I_\sigma(t_n))$ for flexible $n$-place function symbols $f$ 

The relation $M, \sigma \models \varphi$ for formulas $\varphi$ from L is defined as follows: 

$M, \sigma \not\models \bot$ 
$M, \sigma \models R(t_1, \ldots, t_n)$ iff $I_M(R)(I_\sigma(t_1), \ldots, I_\sigma(t_n)) = 1$ for rigid relation symbols $R$ 
$M, \sigma \models R(t_1, \ldots, t_n)$ iff $I_M(R)(\sigma, I_\sigma(t_1), \ldots, I_\sigma(t_n)) = 1$ for flexible relation symbols $R$ 
$M, \sigma \models (\varphi \Rightarrow \psi)$ iff either $M, \sigma \models \psi$, or $M, \sigma \not\models \varphi$ 
$M, \sigma \models \exists x\varphi$ iff $\langle F_M, J\rangle, \sigma \models \varphi$ for some $J$ that $x$-agrees with $I_M$ 
$M, \sigma \models \Diamond_l\varphi$ iff $M, \sigma' \models \varphi$ for some $\sigma' \in \mathbf{I}(T_M)$ such that $\max\sigma' = \min\sigma$ 
$M, \sigma \models \Diamond_r\varphi$ iff $M, \sigma' \models \varphi$ for some $\sigma' \in \mathbf{I}(T_M)$ such that $\min\sigma' = \max\sigma$ 
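
The satisfaction clauses above can be tried out on a toy example. The following is a minimal sketch, not part of the paper: it assumes a finite time domain {0, ..., N} (a genuine NL frame needs a duration domain satisfying D1-D9, which is infinite), the measure max σ − min σ, formulas encoded as nested tuples, and only 0-place flexible relation symbols. The names `holds`, `flex` and the tuple tags are hypothetical.

```python
# Toy time domain {0, ..., N}; an assumption made only for illustration.
N = 4
INTERVALS = [(a, b) for a in range(N + 1) for b in range(a, N + 1)]


def length(sigma):
    """Measure function m(sigma) = max(sigma) - min(sigma)."""
    return sigma[1] - sigma[0]


def holds(phi, sigma, flex):
    """Decide M, sigma |= phi for a formula given as a nested tuple.

    flex maps each flexible 0-place relation symbol to the set of
    intervals on which it is interpreted as 1.
    """
    op = phi[0]
    if op == "bot":
        return False
    if op == "atom":        # flexible propositional symbol
        return sigma in flex[phi[1]]
    if op == "len_eq":      # the formula "l = k" for a constant k
        return length(sigma) == phi[1]
    if op == "imp":         # (phi => psi)
        return holds(phi[2], sigma, flex) or not holds(phi[1], sigma, flex)
    if op == "dia_l":       # some sigma' with max(sigma') = min(sigma)
        return any(holds(phi[1], s, flex) for s in INTERVALS if s[1] == sigma[0])
    if op == "dia_r":       # some sigma' with min(sigma') = max(sigma)
        return any(holds(phi[1], s, flex) for s in INTERVALS if s[0] == sigma[1])
    raise ValueError(f"unknown connective: {op}")
```

For instance, with `flex = {"b": {(1, 3)}}`, the call `holds(("dia_r", ("atom", "b")), (0, 1), flex)` looks for an interval that starts where (0, 1) ends and on which b holds.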

1.3 Abbreviations 

Along with ordinary classical first order predicate logic abbreviations and infix 
notation, the following NL-specific abbreviations are used: 

^ ^ ^ e ~l = r and 

f = 1. 







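The modal operator $(.;.)$ of ITL is defined as an abbreviation in NL by putting: 

$(\varphi; \psi) \rightleftharpoons \exists x\exists y(x + y = \ell \wedge \Diamond_l\Diamond_r(\varphi \wedge \ell = x) \wedge \Diamond_r\Diamond_l(\psi \wedge \ell = y))$, where $x, y \notin FV((\varphi; \psi))$. 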

1.4 Proof System for NL 

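The proof system of NL consists of axioms for classical first order predicate logic 
with equality, the axioms D1-D9 and the following axioms and rules, where 
$d \in \{l, r\}$ and $\bar{d}$ stands for the direction opposite to $d$: 

(A1) $\Diamond_d\varphi \Rightarrow \varphi$ if $\varphi$ is rigid 
(A2) $0 \le \ell$ 
(A3) $0 \le x \Rightarrow \Diamond_d(\ell = x)$ 
(A4) $\Diamond_d(\varphi \vee \psi) \Rightarrow \Diamond_d\varphi \vee \Diamond_d\psi$ 
(A4') $\Diamond_d\exists x\varphi \Rightarrow \exists x\Diamond_d\varphi$ 
(A5) $\Diamond_d(\ell = x \wedge \varphi) \Rightarrow \Box_d(\ell = x \Rightarrow \varphi)$ 
(A6) $\Diamond_d\Diamond_{\bar{d}}\varphi \Rightarrow \Box_d\Diamond_{\bar{d}}\varphi$ 
(A7) $\ell = x \Rightarrow (\varphi \Leftrightarrow \Diamond_d\Diamond_{\bar{d}}(\ell = x \wedge \varphi))$ 
(A8) $0 \le x \wedge 0 \le y \Rightarrow (\Diamond_d(\ell = x \wedge \Diamond_d(\ell = y \wedge \Diamond_d\varphi)) \Leftrightarrow \Diamond_d(\ell = x + y \wedge \Diamond_d\varphi))$ 

(Mono) from $\varphi \Rightarrow \psi$ infer $\Diamond_d\varphi \Rightarrow \Diamond_d\psi$ 
(Nec) from $\varphi$ infer $\Box_d\varphi$ 
(MP) from $\varphi$ and $\varphi \Rightarrow \psi$ infer $\psi$ 
(G) from $\varphi$ infer $\forall x\varphi$ 

Substitution $[t/x]\varphi$ of variable $x$ by term $t$ in formula $\varphi$ is allowed in proofs 
only if either $t$ is rigid, or $x$ does not occur in the scope of modal operators in $\varphi$. 
This system is complete with respect to the above semantics [1]. 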

2 Probabilistic Timed Automata: an Introductory 
Example to PNL 

Here we slightly generalise the notion of finite probabilistic timed automaton 
from [4]. 

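Definition 1. A finite probabilistic timed automaton is a system of the kind 
$\mathcal{A} = \langle S, A, s_0, \langle D, +, 0, \le\rangle, \langle q_a : a \in A\rangle, \langle P_a : a \in A\rangle\rangle$, where 

$S$ is a finite set of states; 
$A \subseteq \{\langle s, s'\rangle : s, s' \in S, s \ne s'\}$ is a set of transitions; 
$s_0 \in S$ is called the initial state; 
$\langle D, +, 0, \le\rangle$ is a duration domain; 
$q_a \in [0, 1]$ is the choice probability for transition $a \in A$; 
$P_a \in (D \to [0, 1])$ is the duration distribution of transition $a$. 

Given $\mathcal{A}$ with its components named as above, $A_s$ denotes the set 
$\{\langle s, s'\rangle \in A : s' \in S\}$ of the transitions that leave state $s$. The choice probabilities 
$\langle q_a : a \in A\rangle$ are required to satisfy $\sum_{a \in A_s} q_a = 1$ for $A_s \ne \emptyset$. The duration 
distributions $\langle P_a : a \in A\rangle$ are required to satisfy $P_a(0) > 0$, to be non-strictly 
monotonic and to converge towards 1. 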

An automaton $\mathcal{A}$ of the above kind works by going through a finite or infinite 
sequence of states $s_0, s_1, \ldots, s_n, \ldots$ such that $\langle s_i, s_{i+1}\rangle \in A$ for all $i$. Each 
transition has a duration, $d_i$. Thus, individual behaviours of $\mathcal{A}$ are recorded as 
sequences of the kind $\langle a_0, d_0\rangle, \ldots, \langle a_n, d_n\rangle, \ldots$, $a_i \in A$, $d_i \in D$, where the initial 
state of $a_0$ is $s_0$, and every transition arrives at the initial state of the next one. 
Having arrived at state $s$, $\mathcal{A}$ chooses transition $a \in A_s$ with probability $q_a$. The 
probability for its duration to be no bigger than $d$ is $P_a(d)$. 
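
The way such an automaton produces behaviours can be sketched in code. This is an illustrative simulation under stated assumptions, not the paper's construction: transitions are pairs of state names, the choice probabilities are given as a dictionary, and the duration distribution of each transition is supplied as a sampling function. All names (`sample_behaviour`, `sample_duration`, the states in the usage note) are hypothetical.

```python
import random


def sample_behaviour(transitions, q, sample_duration, s0, steps, rng):
    """Sample a finite prefix (a_0, d_0), ..., (a_k, d_k) of a behaviour.

    transitions: set of pairs (s, s') with s != s'
    q: dict mapping each transition a to its choice probability q_a
    sample_duration: function (a, rng) -> duration drawn according to P_a
    """
    state, behaviour = s0, []
    for _ in range(steps):
        # Transitions that leave the current state.
        enabled = sorted(a for a in transitions if a[0] == state)
        if not enabled:  # no outgoing transition: the behaviour is finite
            break
        a = rng.choices(enabled, weights=[q[t] for t in enabled])[0]
        behaviour.append((a, sample_duration(a, rng)))
        state = a[1]  # the next transition starts where this one arrived
    return behaviour
```

A usage example with two states and durations drawn uniformly from {0, 1, 2}: `sample_behaviour({("idle", "busy"), ("busy", "idle")}, {("idle", "busy"): 1.0, ("busy", "idle"): 1.0}, lambda a, rng: rng.choice([0, 1, 2]), "idle", 5, random.Random(0))`.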







Given a language L for NL with a 0-place temporal relation symbol $a$ for 
every $a \in A$, a behaviour $\langle a_i, d_i\rangle$, $i = 0, 1, \ldots$ can be represented as a model 
$\langle F, I\rangle$ for this language, where $F = \langle\langle D, \le\rangle, \langle D, +, 0, \le\rangle, \lambda\sigma.\max\sigma - \min\sigma\rangle$, by 
putting $I(a)(\sigma) = 1$ iff $a = a_i$ and $\sigma = \left[\sum_{j<i} d_j, \sum_{j\le i} d_j\right]$ for some $i$. 


Some properties of A behaviours can be straightforwardly expressed under 
this convention. For example, if a = (s, s') G A, 



OJia “■ ( V 

means that a behaviour which ends at a can only continue with a transition 
whose initial state is the final state of a, and 
a ^ 0) 

means that no transition can begin at some time point and end in two distinct 
time points. 

Now consider the set M of all the interpretations of L into F that represent 
behaviours of A in the above way. We need the following definition: 



Definition 2. Let $\tau \in T_F$. We say that interpretations $I$ and $J$ of L into $F$ 
$\tau$-agree iff they coincide for rigid symbols from L, and coincide for flexible symbols 
from L on intervals $\sigma$ such that $\max\sigma \le \tau$. 

The probabilistic components $\langle q_a : a \in A\rangle$ and $\langle P_a : a \in A\rangle$ of $\mathcal{A}$ can be used to 
endow $\mathbf{M}$ with probabilistic structure as follows: 

For every $\tau \in D$ and every $I \in \mathbf{M}$ a probability measure $P_{I,\tau}$ is introduced 
on the subsets of $\mathbf{M}_{I,\tau} = \{I' \in \mathbf{M} : I' \ \tau\text{-agrees with } I\}$. Given 
$\mathbf{N} \subseteq \mathbf{M}_{I,\tau}$, $P_{I,\tau}(\mathbf{N})$ denotes the probability for $\mathcal{A}$ to continue a behaviour 
that is described by $I$ up to time $\tau$ by one from $\mathbf{N}$. 

Assume that, in addition to the temporal relation symbols $a \in A$, L contains the 
rigid symbols $q_a$ and $P_a$, $a \in A$, and they are interpreted by the corresponding 
components of $\mathcal{A}$ in all the interpretations from $\mathbf{M}$. Assume that we introduce 
an operator $p$ to L in the following way: 

o If $\varphi$ is a formula, then $p(\varphi)$ is a term. 

o $I_\sigma(p(\varphi)) = P_{I,\max\sigma}(\{I' \in \mathbf{M}_{I,\max\sigma} : \exists\tau \ge \min\sigma \ \langle F, I'\rangle, [\min\sigma, \tau] \models \varphi\})$ 

In words, let $p(\varphi)$ evaluate under $I$ to the probability of the set of those 
interpretations of L which are continuations of $I$ from time $\max\sigma$ on and satisfy $\varphi$ 
at some interval starting at time $\min\sigma$. 

Using $p$, the probabilistic law about $\mathcal{A}$ behaviours can be expressed as follows: 

$a \Rightarrow p((a; b \wedge \ell = x)) = q_b.P_b(x)$, if $a = \langle s, s'\rangle$ and $b = \langle s', s''\rangle$ for some 
$s, s', s'' \in S$; 

$a \Rightarrow p((a; b)) = 0$ otherwise. 

To carry out this way of introducing $p$ and its interpretation rigorously, we can 
consider models that consist of an NL frame $F$, a set of interpretations $\mathbf{M}$ of L 
into $F$, and a system of probability measures $P_{I,\tau}$, $I \in \mathbf{M}$, $\tau \in T_F$, as specified 
above. 







Having in mind that the values of p-terms are not necessarily similar to 
durations, F should contain a separate domain for probabilities too. Accordingly, 
languages that interpretations from M are defined on should have a sort for this 
domain. This is essentially what PNL models are. 



3 A Formal Definition of PNL 

3.1 Languages 

A PNL language is built starting from the same kinds of symbols as an NL 
language. PNL languages are two-sorted. Together with the sort of durations 
known from NL, they have a sort of probabilities. Along with the arity, each non-logical 
symbol of a PNL language has a description of the sorts of each of its arguments, 
and of its value, in case it is an individual variable, a constant or a function 
symbol. For example, the function symbols $P_a$ from automata-related languages 
take an argument of the duration sort to make a term of the probability sort. 
A PNL language should contain countably many individual variables of both 
sorts. Together with the symbols $0$, $+$, $=$, $\le$ and $\ell$ of the sort of durations, PNL 
languages always contain the rigid constants $0$ and $1$, the rigid function symbol 
$+$, and the rigid relation symbols $\le$ and $=$ of the new sort of probabilities. Using 
the same notation for both probability and duration $0$, $+$ and $=$ does not cause 
confusion. 

The BNF for formulas is as in NL. The BNF for terms in NL languages is 
extended to capture the terms that express probability in PNL languages as 
follows: 

$t ::= x \mid c \mid f(t, \ldots, t) \mid p(\varphi, t, \ldots, t)$ 

Terms of the kind $f(t, \ldots, t)$ are well-formed only if the sorts of the subterms $t$ 
match the requirements for $f$. A similar condition applies to atomic formulas. 
Terms of the kind $p(\varphi, t, \ldots, t)$ (p-terms) have the probability sort. They contain 
one formula argument $\varphi$ and as many term arguments as there are free variables 
of $\varphi$. Let $x_1, \ldots, x_n$ be the free variables of $\varphi$, listed in the order of their first 
free occurrences in $\varphi$. Then $p(\varphi, t_1, \ldots, t_n)$ is well-formed iff $t_i$ has the sort of 
$x_i$, $i = 1, \ldots, n$. Besides, $p(\varphi, x_1, \ldots, x_n)$ is abbreviated to $p(\varphi)$. This looks the 
same as for closed $\varphi$, but is no source of confusion. We put 

$FV(p(\varphi, t_1, \ldots, t_n)) = \bigcup_{i=1}^{n} FV(t_i)$ 

and 

$[t/x]p(\varphi, t_1, \ldots, t_n) \rightleftharpoons p(\varphi, [t/x]t_1, \ldots, [t/x]t_n)$. 

The symbol p is not a non-logical symbol in PNL languages. Its role is rather 
like that of modal operators, yet it is used to construct terms, not formulas. 
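
The treatment of free variables and substitution in p-terms can be illustrated with a small term datatype. This is a sketch under assumptions, not the paper's formalisation: sorts are ignored, the formula argument of a p-term is kept as an opaque string, and the class and function names are hypothetical. It shows that only the term arguments of a p-term contribute free variables and are affected by substitution.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:  # f(t_1, ..., t_n)
    symbol: str
    args: Tuple["Term", ...]


@dataclass(frozen=True)
class PTerm:  # p(phi, t_1, ..., t_n); phi is kept opaque in this sketch
    formula: str
    args: Tuple["Term", ...]


Term = Union[Var, Const, App, PTerm]


def free_vars(t):
    """FV(t); for a p-term this is the union of FV(t_i) over its term arguments."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Const):
        return set()
    result = set()
    for arg in t.args:
        result |= free_vars(arg)
    return result


def substitute(t, x, s):
    """[s/x]t; inside a p-term only the term arguments are rewritten."""
    if isinstance(t, Var):
        return s if t.name == x else t
    if isinstance(t, Const):
        return t
    new_args = tuple(substitute(a, x, s) for a in t.args)
    return App(t.symbol, new_args) if isinstance(t, App) else PTerm(t.formula, new_args)
```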







3.2 Frames, Models and Satisfaction 

In order to enable a finite complete first order proof system for PNL, we 
introduce probability domains in PNL abstractly, like the other NL domains: 

Definition 1. A system of the kind $\langle U, +, 0, 1\rangle$ is a probability domain 
if it satisfies the axioms: 

(U1) $x + (y + z) = (x + y) + z$ 
(U2) $x + 0 = x$ 
(U3) $x + y = y + x$ 
(U4) $x + y = x + z \Rightarrow y = z$ 
(U5) $x + y = 0 \Rightarrow x = 0$ 
(U6) $\exists z(x + z = y \vee y + z = x)$ 
(U7) $0 \ne 1$ 

The classical probability domain is $\langle\mathbb{R}_+, +, 0, 1\rangle$. Another example is 
$\langle\{\frac{i}{n} : i < \omega\}, +, 0, 1\rangle$, where $n$ is a fixed positive integer. 
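
As a sanity check, the axioms can be tested mechanically on a finite sample of a candidate domain. The sketch below assumes that the second example is the set of non-negative multiples of 1/n and uses exact rational arithmetic; it only tests finitely many instances, so it can refute but not prove that a structure is a probability domain. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def satisfies_axioms(sample, n):
    """Test U1-U7 on a finite sample of the candidate domain {i/n : i = 0, 1, ...}."""
    def in_domain(v):
        return v >= 0 and (v * n).denominator == 1

    zero, one = Fraction(0), Fraction(1)
    ok = in_domain(zero) and in_domain(one) and zero != one  # U7
    for x in sample:
        ok &= in_domain(x) and x + zero == x                 # U2
    for x, y in product(sample, repeat=2):
        ok &= in_domain(x + y)                               # closure under +
        ok &= x + y == y + x                                 # U3
        ok &= (x + y != zero) or (x == zero)                 # U5
        ok &= in_domain(abs(x - y))                          # U6, witness z = |x - y|
    for x, y, z in product(sample, repeat=3):
        ok &= x + (y + z) == (x + y) + z                     # U1
        ok &= (x + y != x + z) or (y == z)                   # U4
    return bool(ok)
```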

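We assume that the linear ordering $\le$ which is defined by the equivalence 
$x \le y \Leftrightarrow \exists z(x + z = y)$ is available for probability domains. 

Definition 2. A tuple of the kind $\langle\langle T, \le\rangle, \langle D, +, 0, \le\rangle, \langle U, +, 0, 1\rangle, m\rangle$ is a PNL 
frame if $\langle\langle T, \le\rangle, \langle D, +, 0, \le\rangle, m\rangle$ is an (ordinary, one-sorted) NL frame, and 
$\langle U, +, 0, 1\rangle$ is a probability domain. 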

Interpretations of symbols from PNL languages into PNL frames are defined 
as for (one-sorted) NL languages. Of course, the types of the functions and 
relations that symbols evaluate to should match the types of the symbols. 
Besides, the obligatory symbols $0$, $1$, $+$ and $\le$ of the probability sort should be 
interpreted by the corresponding components of the frame's probability domain. 

The setting given in the previous section makes it clear that the values of 
the probability measures $P_{I,\tau}$ are relevant to the interpretation of p-terms only 
for some formula-definable subsets of the set of interpretations that is part of 
every PNL model. These subsets are difficult to describe prior to defining the 
relation $\models$ in the corresponding model. On the other hand, requesting $P_{I,\tau}$ to be 
defined on the entire powersets of interpretations would render the forthcoming 
completeness theorem unreasonably difficult to prove. 

That is why, before defining PNL models, we introduce an auxiliary notion 
of partial PNL models: 

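Definition 3. Let L be a language for PNL. A triple $\langle F, \mathbf{M}, P\rangle$ is a partial 
PNL model for L if $F$ is a PNL frame, $\mathbf{M}$ is a set of interpretations of the 
non-logical symbols of L into $F$, and $P = \langle P_{I,\tau} : I \in \mathbf{M}, \tau \in T_F\rangle$ is a system 
of partial functions $P_{I,\tau} \in (2^{\mathbf{M}_{I,\tau}} \to U_F)$, where $\mathbf{M}_{I,\tau} = \{I' \in \mathbf{M} : I' \ \tau\text{-agrees with } I\}$, 
which satisfy the equalities 

$P_{I,\tau}(\emptyset) = 0$, $P_{I,\tau}(\mathbf{M}_{I,\tau}) = 1$, $P_{I,\tau}(\mathbf{N}_1) + P_{I,\tau}(\mathbf{N}_2) = P_{I,\tau}(\mathbf{N}_1 \cup \mathbf{N}_2) + P_{I,\tau}(\mathbf{N}_1 \cap \mathbf{N}_2)$ 

for whichever $\mathbf{N}_1, \mathbf{N}_2 \subseteq \mathbf{M}_{I,\tau}$ the function $P_{I,\tau}$ is defined. 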
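
The equalities required of a partial probability function can be checked on a small finite example. The sketch below makes two assumptions that go beyond the text: a partial function is represented as a dictionary whose missing keys stand for "undefined", and the additivity equality is tested only when all four values involved are defined. It also insists that the empty set and the whole set are in the domain. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def is_partial_probability(P, universe):
    """P maps frozensets (subsets of universe) to probabilities; missing keys are undefined."""
    if P.get(frozenset()) != 0 or P.get(frozenset(universe)) != 1:
        return False
    for n1, n2 in combinations(P, 2):
        union, inter = n1 | n2, n1 & n2
        # Assumption: the equality is only required where all four values exist.
        if union in P and inter in P and P[n1] + P[n2] != P[union] + P[inter]:
            return False
    return True
```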

In the above definition $P_{I,\tau}$ are partial probability functions on the sets of 
interpretations $\mathbf{M}_{I,\tau}$. They take the abstract kind of probabilities we introduced 
as their values. 







We proceed to define the satisfaction relation $\models$ on partial PNL models. In 
order to define $\models$ for formulas of the kind $\exists x\varphi$, we need a technical definition: 

Definition 4. Given an interpretation $I$ of language L into frame $F$ and a non-logical 
symbol $s$ from L, $I_s^a$ stands for the interpretation of L into $F$ that $s$-agrees 
with $I$ and interprets $s$ as $a$. Given a set of interpretations $\mathbf{N}$ of L into $F$, $\mathbf{N}_s^a$ 
is $\{I_s^a : I \in \mathbf{N}\}$. Given a partial function $f : 2^{\mathbf{N}} \to U_F$, the partial function 
$f_s^a : 2^{\mathbf{N}_s^a} \to U_F$ is defined by putting $f_s^a(\mathbf{N}'^a_s) = f(\mathbf{N}')$ for $\mathbf{N}' \subseteq \mathbf{N}$, if $f(\mathbf{N}')$ is 
defined. If $f(\mathbf{N}')$ is undefined, then $f_s^a(\mathbf{N}'^a_s)$ is undefined too. Given a partial PNL 
model $M = \langle F, \mathbf{M}, P\rangle$, $M_s^a$ is $\langle F, \mathbf{M}_s^a, P_s^a\rangle$, where $P_s^a$ consists of the functions 
$(P_{I,\tau})_s^a$. 

Obviously $M_s^a$ is a partial PNL model if $M$ is one. We abbreviate $(\ldots I_{s_1}^{a_1} \ldots)_{s_n}^{a_n}$ 
to $I_{s_1,\ldots,s_n}^{a_1,\ldots,a_n}$. The same applies to models $M$. 

Values $I_\sigma(t)$ of terms $t$ and the modelling relation $\models$ are partially defined in 
PNL models by simultaneous induction on the length of terms and formulas. 
The clauses about the kinds of terms and formulas that are known from NL 
are as in NL. Given a PNL model $M = \langle F, \mathbf{M}, P\rangle$ and $I \in \mathbf{M}$, the clause for 
$M, I, \sigma \models \varphi$ is the same as that for $\langle F, I\rangle, \sigma \models \varphi$. Each clause applies only if 
the entities on its right side are defined. The only clause which is subjected to a 
somewhat greater change is the one about existential formulas: 

$M, I, \sigma \models \exists x\varphi$ iff there exists an $a$ such that $a \in D_F$, in case $x$ is a duration variable, 
and $a \in U_F$, in case $x$ is a probability variable, and $M_x^a, I_x^a, \sigma \models \varphi$ 

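The new, PNL-specific clause is about p-terms. Given a well-formed p-term 
$p(\varphi, t_1, \ldots, t_n)$, $I_\sigma(p(\varphi, t_1, \ldots, t_n))$ is defined as 

$P_{I,\max\sigma}(\{I' \in \mathbf{M}_{I,\max\sigma} : \exists\tau \ge \min\sigma \ \ M_{x_1,\ldots,x_n}^{I_\sigma(t_1),\ldots,I_\sigma(t_n)}, (I')_{x_1,\ldots,x_n}^{I_\sigma(t_1),\ldots,I_\sigma(t_n)}, [\min\sigma, \tau] \models \varphi\})$ 

only if $P_{I,\max\sigma}$ is defined for the given set. In case $\varphi$ is closed, this definition 
simplifies to 

$I_\sigma(p(\varphi)) = P_{I,\max\sigma}(\{I' \in \mathbf{M}_{I,\max\sigma} : \exists\tau \ge \min\sigma \ \ M, I', [\min\sigma, \tau] \models \varphi\})$. 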

In words, given an interpretation $I \in \mathbf{M}$ and an interval $\sigma$, $I_\sigma(p(\varphi))$ 
represents the probability of the set of those interpretations $I' \in \mathbf{M}$ which are like $I$ 
up to the end of the interval $\sigma$ and satisfy $\varphi$ at some interval which has the same 
beginning as $\sigma$. For a modelled system's behaviour which is represented by $I$ for 
the time until $\max\sigma$, this term can represent the probability for this behaviour 
to continue so that $\varphi$ eventually gets satisfied in the specified kind of interval. 

In the general case the operator $p$ evaluates the above probability under the 
assumption that the free variables of $\varphi$ evaluate to the values which $t_1, \ldots, t_n$ 
have in the current interval $\sigma$. 

Note that interpretations $I'$ which $\max\sigma$-agree with the selected one $I$ may 
happen to satisfy $\varphi$ at intervals $[\min\sigma, \tau]$ where $\tau \le \max\sigma$. In this case 
satisfaction of $\varphi$ may happen to be a simple consequence of $\max\sigma$-agreeing with $I$, 
and no substantial probability evaluation is involved. For example 

$(\varphi; \top) \Rightarrow p(\varphi) = 1$ 

is a valid PNL formula if $\varphi$ is retrospective (see Definition 1 in Subsection 4.1), that is, if $\varphi$ does 
not specify properties of interpretations beyond the end of the current interval. 







Having defined (partial) $\models$ on partial PNL models, we are ready to define 
total PNL models: 

Definition 5. A partial PNL model $M$ is a (total) PNL model if values of 
terms and satisfaction of formulas from the corresponding language are 
everywhere defined in $M$. 

For the rest of the paper only total PNL models are considered. 

4 A Complete Proof System for PNL 

We need to specify a special class of PNL formulas, in order to introduce our 
proof system. 

4.1 Retrospective Formulas and Interpretations Which x-agree 

Definition 1. We call NL formulas that can be defined by the BNF 

$\varphi ::= \bot \mid R(t, \ldots, t) \mid \neg\varphi \mid (\varphi \wedge \psi) \mid (\varphi; \psi) \mid \Diamond_l\varphi \mid \exists x\varphi$ 

retrospective. 

There is a close connection between retrospective formulas and interpretations 
that $\tau$-agree: 

Proposition 1. Let $F$ be a frame and $\tau \in T_F$. Let $I$ and $J$ be interpretations 
of L into $F$ that $\tau$-agree. Let $\sigma \in \mathbf{I}(T_F)$ and $\max\sigma \le \tau$. Then $\langle F, I\rangle, \sigma \models \varphi$ iff 
$\langle F, J\rangle, \sigma \models \varphi$ for all retrospective $\varphi$ from L. 

Since occurrences of modal operators can be removed from rigid formulas 
due to A1, rigid formulas share the properties of retrospective formulas. 

4.2 The System 

The proof system for PNL that we propose is an extension of that for NL with 
the axioms U1-U7 about probabilities, and the following axioms and rules: 
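($P_\bot$) $p(\bot) = 0$ 
($P_\top$) $p(\top) = 1$ 
($P_+$) $p(\varphi) + p(\psi) = p(\varphi \vee \psi) + p((\varphi; \top) \wedge (\psi; \top))$ 
($P_=$) $x_1 = y_1 \wedge \ldots \wedge x_n = y_n \Rightarrow p(\varphi, x_1, \ldots, x_n) = p(\varphi, y_1, \ldots, y_n)$ 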

p {Oj'f Oix) p =p A 0) 

(Po) P => p(V’) < P{x) if P is retrospective, (P) {(p-,p{%li) = x) ^ p{{p\'tj})) = x 
Note that in the above axioms and rules terms like $p(\varphi)$ should be understood 
as abbreviations of the kind $p(\varphi, x_1, \ldots, x_n)$, as stated in Subsection 3.1. This 
means that these axioms and rules have instances with formulas that have free 
variables. Po and P can be applied only to theorems of PNL. Substitution in 
p-terms is allowed in proofs only if the substitute term is rigid. 

The soundness of the above system is established in the ordinary way. Given 
a PNL language L, we denote the set of all PNL theorems in L by PNL-p. 

Consistency and maximal consistency are defined for sets of formulas in a 
PNL language with respect to the above proof system in the ordinary way. We 
have the following completeness result about our proof system: 







Theorem 1. Let $\Gamma$ be a set of formulas from a PNL language L. Then $\Gamma$ is 
consistent iff there exists a model $M = \langle F, \mathbf{M}, P\rangle$ for L, an interpretation $I \in \mathbf{M}$ 
and an interval $\sigma \in \mathbf{I}(T_F)$ such that $M, I, \sigma \models \Gamma$. 

A proof of this theorem can be found in [7,9]. 

5 Chapman-Kolmogorov’s Equality for Composition in 
PNL 

The means to express sequential composition of (probabilistic) processes in PNL 
is the defined operator (.; .). In this section we extend the semantics of PNL 
and its proof system so that probabilities of formulas with (.; .) satisfy Chapman- 
Kolmogorov’s equality about sequential composition under reasonable assump- 
tions. 

Since this equality involves integration, probability domains are extended 
with multiplication, which is needed to define integration. Multiplication of 
probabilities is required to satisfy the axioms: 

(U8) $x.1 = x$ 
(U9) $x.(y.z) = (x.y).z$ 
(U10) $x.y = y.x$ 
(U11) $x.(y + z) = x.y + x.z$ 
(U12) $x.y = x.z \wedge x \ne 0 \Rightarrow y = z$ 
(U13) $x \ne 0 \Rightarrow \exists y(x.y = z)$ 

We extend the proof system of PNL by the rules: 

T 7 ^ 0) 

(P) £ = 0 A p{ip A9 ^ pHTj V’)) < a:) = 1 p{{t 9; if)) < x.p{(p A 9) 

T ^ 0) 

(F) ^ = 0 A p{(p A9 ^ pHt', V’)) > a:) = 1 p{{p 9] if)) > x.p{(p A 9) 

For a formula (p to specify a step in some process, it is natural to expect that 
bpiVL T ^ “’(</>; £ 0). That is why the latter formula is used as a premiss for 

P and If. Let be a formula from some PNL language L. Let M = {F, M, P) 
be a model for L. Let (Jq G I(7f) be a 0-length interval and Iq G M. Let 
/ = A/.P/,t'(M// T hen the equality of Chapman-Kolmogorov can 
be expressed as: 

Plo.rC^Io.a.i^l'tP)) = J f{I)dPla,T- 

The integral which occurs above is defined as the least upper bound of the 
sums of the kind $\sum_{i=1}^{n} \inf_{I \in A_i} f(I).P_{I_0,\tau}(A_i)$, in case it is equal to the greatest lower 
bound of the sums of the kind $\sum_{i=1}^{n} \sup_{I \in A_i} f(I).P_{I_0,\tau}(A_i)$, where $\{A_1, \ldots, A_n\}$ ranges 
over the finite partitions of $\mathbf{M}_{I_0,\tau}$ for which $P_{I_0,\tau}(A_i)$, $i = 1, \ldots, n$, is defined. 
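
The lower and upper sums used in this definition can be computed directly when the space is finite. The following sketch is only an illustration under assumptions: a finite set of points stands in for the set of interpretations, every subset is measurable, and probabilities are exact rationals. Refining the partition raises the lower sum and lowers the upper sum until they meet. All names are hypothetical.

```python
from fractions import Fraction


def lower_upper_sums(f, measure, partition):
    """Return the lower and upper sums of f over a finite partition.

    f: dict from points to values
    measure: function from a block (a set of points) to its probability
    partition: list of disjoint non-empty blocks covering all points
    """
    lower = sum(min(f[i] for i in block) * measure(block) for block in partition)
    upper = sum(max(f[i] for i in block) * measure(block) for block in partition)
    return lower, upper


# Hypothetical three-point space with point weights 1/2, 1/4, 1/4.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
f = {"a": Fraction(0), "b": Fraction(1, 2), "c": Fraction(1)}


def measure(block):
    return sum(weights[i] for i in block)


coarse = lower_upper_sums(f, measure, [{"a", "b", "c"}])
fine = lower_upper_sums(f, measure, [{"a"}, {"b"}, {"c"}])
```

On the finest partition both sums coincide, and their common value is the integral of f in the sense described above.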

In order to enable this definition, we need to require that the linear ordering 
of $U_F$ is complete. Unfortunately, this cannot be enforced by first-order means. 
In the general case we can show that the above rules entail the following 
approximation of Chapman-Kolmogorov's equality for models $M = \langle F, \mathbf{M}, P\rangle$ which 
validate them: 







Let $I_0 \in \mathbf{M}$, and $\sigma \in \mathbf{I}(T_F)$ be a 0-length interval. Let $\varphi$ and $\psi$ be formulas, 
and $\varphi$ satisfy the premiss of our rules. Let $n < \omega$. Then there exists a partition 
{Ai : i < n} of M/^ ^,(,3 such that (i — 1) < n.f{L) < i for L G A^, i = 0, . . . ,n, 

n n 

and moreover 'Y^{i— < n.P/g_T-(M/g {tp'.ip)) — S ’^■Plo.ri.Aj) . 

i=l i=0 

Clearly, in the case U = R+, this is equivalent to the precise equality. 



Conclusions 

We believe that, by introducing PNL and finding a complete proof system for 
it, we have made the task of obtaining a similar system for PDC a lot simpler. 
In fact, PNL has the expressive power of PDC, except for state expressions and 
their durations. However they can be introduced to PNL using the construc- 
tions presented in, e.g. [5]. This makes it reasonable to believe that PNL is an 
appropriate tool for the specification and verification of probabilistic behaviour 
of real-time systems. 



Acknowledgements 

Thanks are due to Dang Van Hung for his remarks and suggestions on draft 
versions of this paper. 



References 

[1] Barua, R., S. Roy and Zhou Chaochen. Completeness of Neighbourhood 
Logic. Proceedings of STACS'99, Trier, Germany, LNCS 1563, 
Springer-Verlag, 1999. 

[2] Chang, C. C. and H. J. Keisler. Model Theory. North Holland, 
Amsterdam, 1973. 

[3] Dutertre, B. On First Order Interval Temporal Logic. Report no. 
CSD-TR-94-3, Department of Computer Science, Royal Holloway, University of 
London, Egham, Surrey TW20 0EX, England, 1995. 

[4] Dang Van Hung and Zhou Chaochen. Probabilistic Duration Calculus 
for Continuous Time. Formal Aspects of Computing, 11, pp. 21-44, 1999. 

[5] Guelev, D. P. A Calculus of Durations on Abstract Domains: 
Completeness and Extensions. Technical Report 139, UNU/IIST, P.O.Box 3058, 
Macau, May 1998. 

[6] Guelev, D. P. Probabilistic Interval Temporal Logic. Technical Report 
144, UNU/IIST, P.O.Box 3058, Macau, August 1998, Draft. 

[7] Guelev, D. P. Probabilistic and Temporal Modal Logics. Ph.D. thesis, 
submitted. 

[8] Guelev, D. P. Interval-related Interpolation in Interval Temporal Logics. 
Submitted, 1999. 

[9] Guelev, D. P. Probabilistic Neighbourhood Logic. Technical Report 196, 
UNU/IIST, P.O.Box 3058, Macau, April 2000. 







[10] Liu Zhiming, A. P. Ravn, E. V. Sørensen and Zhou Chaochen. A 
Probabilistic Duration Calculus. In: Proceedings of the Second International 
Workshop on Responsive Computer Systems. KDD Research and 
Development Laboratories, Saitama, Japan, 1992. 

[11] Mathai, J. Real-Time Systems. Prentice Hall, 1995. 

[12] Moszkowski, B. Temporal Logic For Multilevel Reasoning About 
Hardware. IEEE Computer, 18(2): 10-19, February 1985. 

[13] Trifonov, V. T. A completeness theorem for the probabilistic interval 
temporal logic with respect to its standard semantics. M.Sc. thesis, Sofia 
University, July 1999. (In Bulgarian) 

[14] Zhou Chaochen and M. R. Hansen. An Adequate First Order Interval 
Logic. International Symposium, Compositionality - The Significant 
Difference, H. Langmaack, A. Pnueli and W.-P. de Roever (eds.). Springer, 
1998. 

[15] Zhou Chaochen, C. A. R. Hoare and A. P. Ravn. A Calculus of 
Durations. Information Processing Letters, 40(5), pp. 269-276, 1991. 

Remark: UNU/IIST technical reports are available from URL 

http://www.iist.unu.edu/newrh/III/1/page.html 




An On-the-Fly Tableau Construction 
for a Real-Time Temporal Logic 



Marc Geilen and Dennis Dams 

Faculty of Electrical Engineering, Eindhoven University of Technology 
P.O.Box 513, 5600 MB Eindhoven, The Netherlands 
E-mail: {m.c.w.geilen, d.dams}@tue.nl 



Abstract. Temporal logic is a useful tool for specifying correctness 
properties of reactive programs. In particular, real-time temporal logics have 
been developed for expressing quantitative timing aspects of systems. A 
tableau construction is an algorithm that translates a temporal logic 
formula into a finite-state automaton that accepts precisely all the models 
of the formula. It is a key ingredient to checking satisfiability of a 
formula as well as to the automata-theoretic approach to model checking. 
An improvement to the efficiency of tableau constructions has been the 
development of on-the-fly versions. In the real-time domain, tableau 
constructions have been developed for various logics and their complexities 
have been studied. However, there has been considerably less work 
aimed at improving and implementing them. In this paper, we present an 
on-the-fly tableau construction for a linear temporal logic with dense 
time, a fragment of Metric Interval Temporal Logic that is decidable in 
PSPACE. We have implemented a prototype of the algorithm and give 
experimental results. Being on-the-fly, our algorithm is expected to use 
less memory and to give smaller tableaux in many practical cases than 
existing constructions. 



1 Introduction 

Temporal logic has enjoyed an increased interest ever since it was suggested, in 
[12], as a formalism for specifying correctness properties of reactive programs. 
Using this approach, several questions about a program’s specification can be 
phrased in formal terms. When the (abstraction of the) program is finite-state, 
model checking procedures can be used to verify correctness automatically. To- 
day, a rich variety of temporal logics exist in which properties of various kinds 
can be expressed. In particular, real-time temporal logics have been developed 
for expressing timing aspects of systems. 

A tableau construction is an algorithm that translates a temporal logic for- 
mula into a finite-state automaton (possibly on infinite words) that accepts pre- 
cisely all the models of the formula. Tableau constructions play a role in several 
places in the area of specification and verification. For example, it is the key to 
deciding satisfiability of formulas. Furthermore, the automata-theoretic approach 
to model checking ([11,14]) relies on tableau algorithms to turn a temporal for- 
mula into an observer of a program’s behaviours. Driven by practical needs, 

M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 276-290, 2000. 

© Springer- Verlag Berlin Heidelberg 2000 







tableau constructions are being continuously improved and reimplemented (e.g. 
[8,5,6]). One such improvement has been the development of on-the-fly versions 
of tableau constructions. In general, this means that the tableau automaton is 
constructed in a lazy way, generating states and transitions as they are needed. 
At the heart of such on-the-fly tableau constructions is a normal form for tem- 
poral formulas in which the constraints on the current state are separated from 
the constraints on the future states. Another research topic is the search for frag- 
ments of temporal logics that reduce the complexity of tableau constructions, 
which is PSPACE or worse for many common logics. Furthermore, the limits of 
expressiveness of logics within certain complexity classes are explored ([4,13]). 

In the real-time domain, tableau constructions have been developed for 
various logics and their complexities have been studied ([1,9,13]). The key 
observation behind the tableau constructions in the case of a real-valued time domain 
is that, under certain restrictions on the logic, the continuous 
time axis can be discretised. More precisely, given a real-time temporal logic 
formula $\varphi$, time can be sliced into countably many intervals in such a way that 
along each interval, the truth value of any subformula of $\varphi$ is invariant. There 
has been considerably less work aimed at improving and implementing tableau 
constructions for real-time temporal logics. 

In this paper, we present an on-the-fly tableau construction for a linear tem- 
poral logic with dense time. The logic that we consider is based on a fragment 
of Metric Interval Temporal Logic (Mitl, see [4]) that is decidable in PSPACE. 
To the best of our knowledge, our tableau construction is the first on-the-fly 
construction for this logic. Compared to the tableau construction for Mitl pre- 
sented in [1], our work is distinguished by the following points. Our tableau 
algorithm is on-the-fly. The technical underpinning of this construction is a ge- 
neralisation to the real-time case of the normal form procedure for formulas, 
that separates constraints on the current time interval from those on the rest 
of the time axis. In order to define this normal form, the logic is extended with 
timers that can be explicitly set and tested, and with a Next operator referring 
to the beginning of the next time interval. As expected, the correctness of the 
normal form procedure hinges on the discretisation result mentioned above. 

A technical complication of the tableau construction of [1] is caused by the 
two different forms that interval bounds may have: open or closed. As a result, 
the transition from an interval of time into the next, adjacent, interval can 
happen in two ways. In order to distinguish these, an additional clock in the 
timed tableau automaton is needed and furthermore, the type of transition must 
be remembered. We impose restrictions on the form of the models of formulas 
and restrict the timer constraints expressible in the logic, in such a way that 
it suffices to consider intervals that are left-closed and right open. Apart from 
saving a timer in the tableau automaton, this simplifies the presentation. We 
have implemented a prototype of the algorithm and give experimental results. 

Section 2 introduces the versions of logic and timed automata we use. Sec- 
tion 3 presents the normal form for formulas. The tableau algorithm, its correc- 
tness, and the implementation are the topic of Section 4. Section 5 concludes. 







2 Preliminaries 

An ($\omega$-)word $w = \sigma_0\sigma_1\sigma_2\cdots$ over an alphabet $\Sigma$ is an infinite
sequence of symbols from $\Sigma$; $w(k)$ denotes $\sigma_k$ and $w^k$ refers to the tail
$\sigma_k\sigma_{k+1}\sigma_{k+2}\cdots$. We use the latter notations for other kinds of
sequences as well. An interval $I = [a,b)$ is a convex, left-closed, and right-open
subset of $\mathbb{R}^{\ge 0}$; $l(I)$ ($r(I)$) denotes the lower (upper) bound $a$ ($b$)
and $|I|$ the length of $I$. We use $I - t$ to denote the interval $\{t' - t \mid t' \in I\}$.
An interval sequence $\bar{I} = I_0 I_1 I_2 \cdots$ is an infinite sequence of intervals
that are adjacent, meaning $r(I_i) = l(I_{i+1})$ for every $i$, for which $l(I_0) = 0$,
and which is diverging, i.e. any $t \in \mathbb{R}^{\ge 0}$ belongs to some interval $I_i$.
A timed word $u$ over $\Sigma$ is a pair $(w, \bar{I})$ consisting of an $\omega$-word $w$
over $\Sigma$ and an interval sequence $\bar{I}$. For $t \in \mathbb{R}^{\ge 0}$, $u(t)$ denotes
the symbol present at time $t$, that is, $w(k)$ if $t \in \bar{I}(k)$. For such a $t$ and $k$,
$u^t$ is the tail of the timed word, consisting of the word $w^k$ and of the interval
sequence $[0, r(\bar{I}(k)) - t), \bar{I}(k+1) - t, \ldots$. Timed words $u_1$ and $u_2$ are
equivalent (denoted $u_1 \equiv u_2$) if for all $t \ge 0$, $u_1(t) = u_2(t)$. A timed state
sequence over a set $Prop$ of propositions is a timed word over the alphabet $2^{Prop}$.
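
The lookup $u(t)$ and the interval sequence of a tail $u^t$ can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: a timed word is infinite, so the code works on a finite prefix, and the class `TimedWordPrefix` with its methods is a hypothetical representation that stores just the right bounds of the intervals.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class TimedWordPrefix:
    """Finite prefix of a timed word with intervals [0, b0), [b0, b1), ...

    bounds[k] is the right bound r(I(k)). Adjacency and l(I(0)) = 0 follow
    from storing only right bounds (an assumption of this sketch).
    """
    symbols: Sequence[str]
    bounds: Sequence[float]

    def index_at(self, t: float) -> int:
        """Return k such that t lies in the left-closed, right-open I(k)."""
        if t < 0 or t >= self.bounds[-1]:
            raise ValueError("t lies outside the represented prefix")
        # bisect_right counts the right bounds <= t, i.e. the intervals
        # that have already ended at time t.
        return bisect_right(self.bounds, t)

    def symbol_at(self, t: float) -> str:
        """u(t): the symbol present at time t."""
        return self.symbols[self.index_at(t)]

    def tail_bounds(self, t: float) -> List[float]:
        """Right bounds of the interval sequence of the tail u^t."""
        k = self.index_at(t)
        return [b - t for b in self.bounds[k:]]
```

Because intervals are left-closed and right-open, a time point that coincides with a bound belongs to the interval that starts there, which is why `bisect_right` rather than `bisect_left` is used.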

2.1 Real-time Temporal Logic

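We consider a restricted version of the real-time temporal logic Mitl of [4],
Mitl$_\le$, with formulas of the following form ($d \in \mathbb{N}$).

$\varphi ::= true \mid p \mid \neg\varphi_1 \mid \varphi_1 \vee \varphi_2 \mid \varphi_1 \,\mathcal{U}_{\le d}\, \varphi_2$

Formulas of this form are called basic, in order to distinguish them from formulas
using an extended syntax that is to be defined in Section 3.1. A formula is
interpreted over a timed state sequence $\rho$ as follows.

- $\rho \models true$ for every timed state sequence $\rho$;

- $\rho \models p$ iff $p \in \rho(0)$;

- $\rho \models \neg\varphi$ iff not $\rho \models \varphi$;

- $\rho \models \varphi_1 \vee \varphi_2$ iff $\rho \models \varphi_1$ or $\rho \models \varphi_2$;

- $\rho \models \varphi_1 \,\mathcal{U}_{\le d}\, \varphi_2$ iff there is some $0 \le t \le d$, such that $\rho^t \models \varphi_2$ and for all
$0 \le t' < t$, $\rho^{t'} \models \varphi_1$.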
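
To make the Until clause concrete, the following Python sketch evaluates $p_1 \,\mathcal{U}_{\le d}\, p_2$ at time 0 for two atomic propositions over a finite prefix of a timed state sequence. It is a minimal sketch, assuming the bound is non-strict as in the clause above and that states are given as sets of propositions with the right bounds of their intervals; the function name `until_within` is hypothetical.

```python
from typing import FrozenSet, Sequence


def until_within(states: Sequence[FrozenSet[str]],
                 bounds: Sequence[float],
                 p1: str, p2: str, d: int) -> bool:
    """Decide p1 U_{<=d} p2 at time 0 on a finite prefix.

    states[k] holds during [bounds[k-1], bounds[k]), with bounds[-1] := 0.
    Raises ValueError if the prefix is too short to decide the formula.
    """
    left = 0.0  # lower bound of the interval currently inspected
    for state, right in zip(states, bounds):
        if left > d:
            return False          # every remaining witness would be later than d
        if p2 in state:
            return True           # witness t = left; p1 held on all of [0, left)
        if p1 not in state:
            return False          # p1 fails before any witness for p2
        left = right
    if left > d:
        return False
    raise ValueError("prefix too short to decide the formula")
```

Since the state is constant on each interval and intervals are left-closed, the earliest possible witness inside an interval is its lower bound, so inspecting one point per interval suffices.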

Note that no basic Mitl$_\le$ formula can discriminate between equivalent timed
state sequences. The restriction of the bound $d$ to finite naturals simplifies the
presentation as it yields a logic in which only safety properties can be expressed,
thus avoiding the need to handle acceptance conditions. It is straightforward to
extend our results to a logic which also includes an unbounded Until operator.

2.2 $\varphi$-fine Interval Sequences

In the on-the-fly tableau constructions of [5,8] for untimed logics, a formula is
rewritten so that the constraints on the current state are separated from those
on the remainder of the state sequence. This is possible because of the discrete
nature of the sequence. In the dense time case, there is no such thing as a next
state. The discretisation suggested by the interval sequence $\bar{I}$ of a timed state
sequence $\rho$ is, in general, not fine enough for our purposes: when interpreted over
tails of $\rho$, the truth value of a formula may vary along a single interval of $\bar{I}$.

Definition 1. Let $\varphi \in$ Mitl$_\le$. An interval sequence $\bar{I}$ is called $\varphi$-fine for timed
state sequence $\rho$ if for every syntactic subformula $\psi$ of $\varphi$, every $k \ge 0$, and every
$t_1, t_2 \in \bar{I}(k)$, we have $\rho^{t_1} \models \psi$ iff $\rho^{t_2} \models \psi$. In case that $\bar{I}$ is $\varphi$-fine for a timed
state sequence $(\bar{\sigma}, \bar{I})$, also $(\bar{\sigma}, \bar{I})$ will be called $\varphi$-fine.

In [1] (Lemma 4.11) it was shown that the intervals of a timed state sequence
can always be refined so that the value of a given Mitl formula does not change
within any interval. Although the timed state sequences have a slightly more
restrictive definition in our case, a similar lemma can be proved for the restricted
logic Mitl$_\le$.

Lemma 1. Let $\varphi$ be a basic Mitl$_\le$ formula and $\rho$ a timed state sequence. Then
there exists a $\varphi$-fine timed state sequence that is equivalent with $\rho$.

Note that this lemma also implies that, when confining ourselves to $\varphi$-fine timed
state sequences, it suffices to consider intervals, which are left-closed and right-open
by our definition. In particular, there is no need to introduce a more general
notion of interval, as in [1].

2.3 Timed Automata 

The targets of our tableau construction are timed automata in the style of Alur
and Dill ([3]). We use a variation that is adapted to our needs. The automata
use timers that decrease as time advances, possibly taking negative values in $\mathbb{R}$.
They may be set to nonnegative integer values and may be compared to zero in
a restricted way. Given a set $T$ of timers, a timer valuation $\nu \in TVal(T)$ is a
mapping $T \to \mathbb{R}$. For $t \in \mathbb{R}^{\ge 0}$, $\nu - t$ denotes the timer valuation that assigns
$\nu(x) - t$ to any timer $x$ in the domain of $\nu$. A timer setting $TS \in TSet(T)$ is
a partial mapping $T \to \mathbb{N}$. We use $TS[x := d]$ (where $x \in T$ and $d \in \mathbb{N}$) to
denote the timer setting that maps $x$ to $d$ and other timers to the same value as
$TS$. $[x := d]$ is short for $\emptyset[x := d]$. For a timer valuation $\nu$ and a timer setting
$TS$, $TS(\nu)$ is the timer valuation that maps any timer $x$ in the domain of $\nu$ to
$TS(x)$ if defined, and to $\nu(x)$ otherwise. The set $TCond(T)$ of timer conditions
over $T$ is $\{x > 0, x \le 0 \mid x \in T\}$. As suggested by Lemma 1 above, our timed
automata may be restricted in such a way that the period of time during which
control resides in a location is always a (left-closed, right-open) interval.
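
The three operations on timers described above (letting time pass, applying a timer setting, and testing a timer condition) can be sketched in Python as follows. This is an illustrative sketch only: the dictionary representation and the names `advance`, `apply_setting` and `holds` are assumptions, and the two condition kinds are encoded as the strings ">0" and "<=0".

```python
from typing import Dict, Tuple

TimerValuation = Dict[str, float]   # total mapping T -> R
TimerSetting = Dict[str, int]       # partial mapping T -> N
TimerCondition = Tuple[str, str]    # (timer, ">0") or (timer, "<=0")


def advance(v: TimerValuation, t: float) -> TimerValuation:
    """v - t: every timer decreases by the elapsed time t."""
    return {x: value - t for x, value in v.items()}


def apply_setting(ts: TimerSetting, v: TimerValuation) -> TimerValuation:
    """TS(v): timers in the domain of TS are overwritten, others keep v(x)."""
    return {x: float(ts.get(x, value)) for x, value in v.items()}


def holds(cond: TimerCondition, v: TimerValuation) -> bool:
    """Evaluate one of the two admitted timer conditions under v."""
    timer, kind = cond
    if kind == ">0":
        return v[timer] > 0
    if kind == "<=0":
        return v[timer] <= 0
    raise ValueError(f"unsupported timer condition: {kind}")
```

Since timers only decrease while time passes, a condition of kind ">0" that holds at some point of a stay in a location also held at every earlier point of that stay, and a condition of kind "<=0" stays true once it holds.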

Definition 2. Let $\Sigma$ be an alphabet. A timed automaton $A = (L, T, L_0, Q, TC, E)$
over $\Sigma$ consists of

- a finite set $L$ of locations;

- a set $T$ of timers;

- a finite set $L_0$ of initial extended locations $(\ell_0, \nu_0) \in L \times TVal(T)$, where
$\nu_0$ assigns integer values to the timers;

- a mapping $Q : L \to 2^{\Sigma}$ labelling every location with a set of symbols from
the alphabet;

- a mapping $TC : L \to 2^{TCond(T)}$ labelling every location with a set of timer
conditions over $T$;

- a set $E \subseteq L \times TSet(T) \times L$ of edges labelled by timer settings.

An extended location $\lambda$ is a pair $(\ell, \nu)$ consisting of a location $\ell$ and a timer
valuation $\nu$. In the context of an automaton with the set $T$ of timers, we use $\mathbf{0}$
to denote the timer valuation that maps every timer in $T$ to 0.

A timed run describes the path taken by the timed automaton when accepting 
a timed word. It gives the location of the automaton and the values of its timers 
at any moment, by recording the sequence of locations, the intervals during which 
the automaton resides in those locations, and the timer values at the beginning 
of each such interval. 

Definition 3. A timed run $\bar{r}$ of a timed automaton $A = (L, T, L_0, Q, TC, E)$ is
a triple $(\bar{\ell}, \bar{I}, \bar{\nu})$ consisting of a sequence of locations, an interval sequence, and
a sequence of timer valuations, such that:

- [Consecution] for all $k \ge 0$, there is an edge $(\bar{\ell}(k), TS_k, \bar{\ell}(k+1)) \in E$ such
that $\bar{\nu}(k+1) = TS_k(\bar{\nu}(k) - |\bar{I}(k)|)$;

- [Timing] for all $k \ge 0$ and $t \in \bar{I}(k)$, the timer valuation at time $t$,
$\bar{\nu}(k) - (t - l(\bar{I}(k)))$, satisfies all timer conditions in $TC(\bar{\ell}(k))$.

In this case we also say that $\bar{r}$ is a run from extended location $(\bar{\ell}(0), \bar{\nu}(0))$, or an
$(\bar{\ell}(0), \bar{\nu}(0))$-run (or simply an $\bar{\ell}(0)$-run). We write $\bar{r}(t)$ to denote the location of
$\bar{r}$ at time $t$, i.e. $\bar{\ell}(k)$ if $t \in \bar{I}(k)$. Given a timed word $u$, $\bar{r}$ is a run for $u$ if$^1$

- [Symbol match] for all $t \ge 0$, $u(t) \in Q(\bar{r}(t))$.

$A$ accepts $u$ if it has a run for $u$ from an initial extended location. The timed
language $L(A)$ of $A$ is the set of all timed words that it accepts.

It follows from Definition 3 that the language of a timed automaton is closed
under equivalence. Together with Lemma 1, this will allow us to restrict our
attention to $\varphi$-fine sequences when arguing the correctness of the tableau
construction for a basic Mitl$_\le$ formula $\varphi$.

3 Disjunctive Temporal Normal Form 

Central to tableau constructions for temporal logics is the observation that every
formula can be rewritten into a normal form in which the constraints on the
current state are (syntactically) separated from the constraints on the future
states. For example, in the case of Propositional Linear-time Temporal Logic
(LTL), any formula can be written as a disjunction of terms of the form $\pi \wedge \bigcirc\varphi$,
where $\pi$ is a conjunction of propositions dictating the constraints on the current
state and the LTL formula $\varphi$ is "guarded" by the Next operator $\bigcirc$, meaning
that it must hold in the future starting from the next state. In order to define
a similar normal form for Mitl$_\le$, the logic needs to be extended by adding a
Next operator and by introducing timers and timer settings into the syntax of
the logic.

$^1$ As locations are labelled with sets of symbols, a single run corresponds in general
to a set of timed words.

3.1 Extending the Logic 

In order to define the normal form, the logic is extended with timers that can be
bound by timer setting operators or which can be free. A formula is interpreted
in a timer environment which provides a value for every free timer. Moreover we
assume that formulas are in positive normal form (negations only occur in front
of propositions) using the dual operators (false, $\wedge$, $\mathcal{V}$). The timer set operator can
set timers to integer values, and timers can be compared to 0 only by checking the
condition $x > 0$ or $x \le 0$. Compared to other timed logics that use freeze/reset
quantifiers (see e.g. [2,10]), the syntax of our logic is restricted: the arguments
of an Until or Release operator must be basic Mitl$_\le$ formulas.

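Definition 4. Mitl$_\le$ is redefined. Besides the basic formulas defined in
Section 2.1 we add formulas of the following form, where $\varphi$, $\varphi_1$, and $\varphi_2$ are basic
formulas, $d \in \mathbb{N}$, $TS$ is a timer setting, and $x$ is a timer.

$\psi ::= \varphi \mid \psi_1 \vee \psi_2 \mid \psi_1 \wedge \psi_2 \mid TS.\psi \mid x > 0 \mid x \le 0 \mid \varphi_1 \mathcal{V}_{<d} \varphi_2 \mid \varphi_1 \mathcal{U}_{\le x} \varphi_2 \mid \varphi_1 \mathcal{V}_{<x} \varphi_2 \mid \bigcirc\psi$

The semantics is extended as follows. We write $\rho \models_\nu \psi$ to denote that the timed
state sequence $\rho$ satisfies $\psi$ in the context of the timer valuation $\nu$.

$\rho \models_\nu \varphi$ iff $\rho \models \varphi$;

$\rho \models_\nu \psi_1 \vee \psi_2$ iff $\rho \models_\nu \psi_1$ or $\rho \models_\nu \psi_2$;

$\rho \models_\nu \psi_1 \wedge \psi_2$ iff $\rho \models_\nu \psi_1$ and $\rho \models_\nu \psi_2$;

$\rho \models_\nu \varphi_1 \mathcal{V}_{<d} \varphi_2$ iff for all $0 \le t < d$, $\rho^t \models_{\nu-t} \varphi_2$ or there is some $0 \le t' < t$,
such that $\rho^{t'} \models_{\nu-t'} \varphi_1$;

$\rho \models_\nu TS.\psi$ iff $\rho \models_{TS(\nu)} \psi$;

$\rho \models_\nu \varphi_1 \mathcal{U}_{\le x} \varphi_2$ iff there is some $0 \le t \le \nu(x)$, such that $\rho^t \models_{\nu-t} \varphi_2$ and for
all $0 \le t' < t$, $\rho^{t'} \models_{\nu-t'} \varphi_1$;

$\rho \models_\nu \varphi_1 \mathcal{V}_{<x} \varphi_2$ iff for all $0 \le t < \nu(x)$, $\rho^t \models_{\nu-t} \varphi_2$ or there is some $0 \le t' < t$,
such that $\rho^{t'} \models_{\nu-t'} \varphi_1$;

$\rho \models_\nu x > 0$ iff $\nu(x) > 0$;

$\rho \models_\nu x \le 0$ iff $\nu(x) \le 0$;

$\rho \models_\nu \bigcirc\psi$ iff $\rho^{|\bar{I}(0)|} \models_{\nu-|\bar{I}(0)|} \psi$, where $\rho = (\bar{\sigma}, \bar{I})$.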

Note that with the $\bigcirc$ operator it is now possible to discriminate between
equivalent timed state sequences. In our tableau construction for the timed case,
formulas will also be rewritten to separate "now" from "next". More precisely,
the "now" part will refer to the current interval of time, and so it will be thought
of as being true during a non-zero period. In contrast, the "next" part of a
formula in normal form will refer to the first point of the next interval. In addition,
in rewriting a formula, we will also separate the setting of the timers to be
effected when entering the current interval. The rewrite rules achieving this will be
presented in the next subsection. Of the equivalences on which they are based,
the most important ones are displayed below. Here, $\psi \equiv \psi'$ abbreviates $\rho \models_\nu \psi$
iff $\rho \models_\nu \psi'$, where $\rho$ is a timed state sequence and $\nu$ a timer valuation, that
are both implicitly universally quantified. For some equivalences, conditions on
$\rho$ and $\nu$ are required which are then explicitly listed. $\varphi_1$ and $\varphi_2$ are basic Mitl$_\le$
formulas, $d \in \mathbb{N}$, and $x$ and $y$ are timers.

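$\varphi_1 \mathcal{U}_{\le d} \varphi_2 \equiv [y := d].(\varphi_1 \mathcal{U}_{\le y} \varphi_2)$    (1)

$\varphi_1 \mathcal{V}_{<d} \varphi_2 \equiv [y := d].(\varphi_1 \mathcal{V}_{<y} \varphi_2)$    (2)

$\varphi_1 \mathcal{U}_{\le x} \varphi_2 \equiv \varphi_2 \vee (x > 0 \wedge \varphi_1 \wedge \bigcirc(\varphi_1 \mathcal{U}_{\le x} \varphi_2))$    (3)
    if $\rho$ is both $\varphi_1$-fine and $\varphi_2$-fine, and $\nu(x) \ge 0$

$\varphi_1 \mathcal{V}_{\le d} \varphi_2 \equiv \varphi_2 \wedge (\varphi_1 \vee \bigcirc(\varphi_1 \mathcal{V}_{\le d} \varphi_2))$    (4)
    if $\rho$ is $\varphi_1 \mathcal{V}_{\le d} \varphi_2$-fine

$\varphi_1 \mathcal{V}_{<x} \varphi_2 \equiv x \le 0 \vee (\varphi_2 \wedge (\varphi_1 \vee \bigcirc(\varphi_1 \mathcal{V}_{<x} \varphi_2)))$    (5)
    if $\rho$ is both $\varphi_1$-fine and $\varphi_2$-fine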



Lemma 2. The equivalences between Mitl$_\le$ formulas presented above hold.

Proof. We will only prove equivalence 4. Let $\rho$ be a timed state sequence that
is fine for $\varphi_1 \mathcal{V}_{\le d} \varphi_2$ and $\nu$ a timer valuation; then $\rho \models_\nu \varphi_1 \mathcal{V}_{\le d} \varphi_2$ iff
$\rho \models_\nu \varphi_2 \wedge (\varphi_1 \vee \bigcirc(\varphi_1 \mathcal{V}_{\le d} \varphi_2))$. ($\Rightarrow$) Let $\rho \models_\nu \varphi_1 \mathcal{V}_{\le d} \varphi_2$. Obviously, $\rho$
satisfies $\varphi_2$. Assume $\rho \not\models_\nu \varphi_1$, then we need to show that $\rho \models_\nu \bigcirc(\varphi_1 \mathcal{V}_{\le d} \varphi_2)$.
Since $\rho$ is $\varphi_1 \mathcal{V}_{\le d} \varphi_2$-fine, if $\varphi_1 \mathcal{V}_{\le d} \varphi_2$ does not hold at the first moment of
the second interval, then this is the first moment where it does not hold. Thus at
the first moment of the second interval $\varphi_1 \mathcal{V}_{\le d} \varphi_2$ holds. ($\Leftarrow$) obvious.

One may wonder why the timer condition $\nu(x) \ge 0$ in rule 3 is listed external
to the formula, instead of making it part of the right-hand side:
$(x \ge 0 \wedge \varphi_2) \vee (x > 0 \wedge \varphi_1 \wedge \bigcirc(\varphi_1 \mathcal{U}_{\le x} \varphi_2))$. The reason is that the truth value
of a condition of the form $x \ge 0$ may change in a "left-open fashion": it may be
true in the first (singular) point of an interval, but false in the remainder. By
syntactically restricting the timer conditions to the forms $x > 0$ and $x \le 0$ only,
it is clearer that the "now" parts of a formula in normal form can indeed be made
true during a full interval. Indeed, the condition $\nu(x) \ge 0$ will turn out to be always
fulfilled when equivalence 3 is used as a rewrite rule in the tableau construction.
This is also the reason for the introduction of the formula $\varphi_1 \mathcal{V}_{<d} \varphi_2$: unfolding
the formula $\varphi_1 \mathcal{V}_{\le x} \varphi_2$ would lead to a timer condition $x < 0$.

3.2 Rewrite Rules 

Definition 5. An Mitl$_\le$ formula is said to be in disjunctive temporal form if
it is of the form $\bigvee_{i=1}^{k} TS_i.(\Pi_i \wedge \bigcirc\Psi_i)$ where $k \ge 0$ (for $k = 0$ the formula
equals false represented as the empty disjunction), the $\Pi_i$ are conjunctions of
atomic propositions, negated atomic propositions and timer conditions, and the
$\Psi_i$ are conjunctions of Mitl$_\le$ formulas.

For presentational purposes we introduce some notation. We will identify a set
$\Psi$ of formulas with the conjunction $\bigwedge\Psi$, the empty conjunction being equivalent
to true. A term is a triple $(TS, Now, Next)$, where $TS$ is a timer setting and
$Now$ and $Next$ are sets of Mitl$_\le$ formulas. Such a term is identified with the
Mitl$_\le$ formula $TS.(Now \wedge \bigcirc Next)$. A set of terms will be identified with the
disjunction of the formulas represented by the individual terms. Note that an
arbitrary Mitl$_\le$ formula $\psi$ is represented by the set $\{(\emptyset, \{\psi\}, \emptyset)\}$ of terms. We
now introduce a number of rewrite rules on sets of terms, that transform any such
set into disjunctive temporal form. These rules are presented in Figure 1, which
is interpreted as follows. Consider a set $\Phi \cup \{(TS, Now \cup \{\psi\}, Next)\}$ of terms.
The row in the table in which the Case field coincides with the shape of the
Mitl$_\le$ formula $\psi$ determines how the set is rewritten. For a timer setting $TS$,
the function $rename_{TS}$ renames the timers occurring in any syntactical object
apart from the timers in $TS$. The following lemma states that the disjunctive
temporal form is a normal form under these rules.

Case                                              $\Phi \cup \{(TS, Now \cup \{\psi\}, Next)\}$ reduces to:

$\psi = true$                                     $\Phi \cup \{(TS, Now, Next)\}$
$\psi = false$                                    $\Phi$
$\psi = \psi_1 \vee \psi_2$                       $\Phi \cup \{(TS, Now \cup \{\psi_1\}, Next), (TS, Now \cup \{\psi_2\}, Next)\}$
$\psi = \psi_1 \wedge \psi_2$                     $\Phi \cup \{(TS, Now \cup \{\psi_1, \psi_2\}, Next)\}$
$\psi = \varphi_1 \mathcal{U}_{\le d} \varphi_2$  $\Phi \cup \{(TS, Now \cup \{[x := d].(\varphi_1 \mathcal{U}_{\le x} \varphi_2)\}, Next)\}$
$\psi = \varphi_1 \mathcal{U}_{\le x} \varphi_2$  $\Phi \cup \{(TS, Now \cup \{\varphi_2 \vee (x > 0 \wedge \varphi_1 \wedge \bigcirc(\varphi_1 \mathcal{U}_{\le x} \varphi_2))\}, Next)\}$
$\psi = \varphi_1 \mathcal{V}_{\le d} \varphi_2$  $\Phi \cup \{(TS, Now \cup \{\varphi_2 \wedge (\varphi_1 \vee \bigcirc(\varphi_1 \mathcal{V}_{\le d} \varphi_2))\}, Next)\}$
$\psi = \varphi_1 \mathcal{V}_{<d} \varphi_2$     $\Phi \cup \{(TS, Now \cup \{[x := d].(\varphi_1 \mathcal{V}_{<x} \varphi_2)\}, Next)\}$
$\psi = \varphi_1 \mathcal{V}_{<x} \varphi_2$     $\Phi \cup \{(TS, Now \cup \{x \le 0 \vee (\varphi_2 \wedge (\varphi_1 \vee \bigcirc(\varphi_1 \mathcal{V}_{<x} \varphi_2)))\}, Next)\}$
$\psi = TS'.\psi'$                                $\Phi \cup \{(TS \cup TS'', Now \cup \{\psi''\}, Next)\}$
                                                  where $TS''.\psi'' = rename_{TS}(TS'.\psi')$
$\psi = \bigcirc\psi'$                            $\Phi \cup \{(TS, Now, Next \cup \{\psi'\})\}$

Fig. 1. Rewrite rules for Mitl$_\le$.

Lemma 3. Let $\psi$ be an Mitl$_\le$ formula and $\psi'$ be obtained from $\psi$ by repeated
application of rules from Figure 1 until no more rule applies. This process
terminates, and $\psi'$ is in disjunctive temporal form. Furthermore, for every timed
state sequence $\rho$ and every timer valuation $\nu$ such that

- $\rho$ is fine for every basic syntactic subformula of $\psi$;

- $\nu(x) \ge 0$ for every timer $x$ that is free in $\psi$ and that occurs in a subformula
of the form $\varphi_1 \mathcal{U}_{\le x} \varphi_2$,

we have $\rho \models_\nu \psi$ iff $\rho \models_\nu \psi'$.








Proof. Termination and the fact that $\psi'$ is in disjunctive temporal form are
easily seen. The equivalence of $\psi$ and $\psi'$ under the stated conditions follows
from Lemma 2 and a few distributivity rules for the $\bigcirc$ and timer set operators.

Depending on the order in which terms from $\Phi$ and formulas from $Now$ are
selected, different normal forms may be obtained. In the sequel, we assume the
existence of a deterministic procedure $NF$ that computes a particular normal
form for any given formula. The following lemma states that the timer setting
$TS$ and the $Now$ and $Next$ parts of a term in a normal form have the properties we
set out for. It will be used in the correctness proof in Section 4.3.

Lemma 4. Let $\psi \in$ Mitl$_\le$, $\nu$ be a timer valuation, and $\rho$ a timed state sequence
with interval sequence $\bar{I}$ that is fine for all basic subformulas of $\psi$. If $\rho \models_\nu \psi$, then
there is some term $(TS, Now, Next) \in NF(\psi)$ such that for all $0 \le t < l(\bar{I}(1))$,
$\rho^t \models_{TS(\nu)-t} Now$, and furthermore $\rho^{l(\bar{I}(1))} \models_{TS(\nu)-l(\bar{I}(1))} Next$.

Proof. Follows by induction on the number of rewrite steps in the normal form
procedure.

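Example The (basic) Mitl$_\le$ formula $\Diamond_{\le 2}\, p = true\, \mathcal{U}_{\le 2}\, p$ has an equivalent
formula in disjunctive temporal form:

$true\, \mathcal{U}_{\le 2}\, p \equiv ([x := 2].p) \vee ([x := 2].(x > 0 \wedge \bigcirc(true\, \mathcal{U}_{\le x}\, p)))$

Here, the equivalence holds for any timer valuation $\nu$ and any timed state
sequence $\rho$ that is fine for $p$. In terms of the normal form procedure, the rewriting
process of $true\, \mathcal{U}_{\le 2}\, p$ proceeds as follows (we write $\Phi_1 \Longrightarrow \Phi_2$ to express that $\Phi_2$
is obtained from $\Phi_1$ by one or more steps in the procedure).

$\{(\emptyset, \{true\, \mathcal{U}_{\le 2}\, p\}, \emptyset)\} \Longrightarrow \{([x := 2], \{true\, \mathcal{U}_{\le x}\, p\}, \emptyset)\} \Longrightarrow$

$\{([x := 2], \{p \vee (x > 0 \wedge true \wedge \bigcirc(true\, \mathcal{U}_{\le x}\, p))\}, \emptyset)\} \Longrightarrow$

$\{([x := 2], \{p\}, \emptyset), ([x := 2], \{x > 0, true, \bigcirc(true\, \mathcal{U}_{\le x}\, p)\}, \emptyset)\} \Longrightarrow$

$\{([x := 2], \{p\}, \emptyset), ([x := 2], \{x > 0\}, \{true\, \mathcal{U}_{\le x}\, p\})\}$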
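
The rewriting steps of this example can be replayed with a small Python sketch of the normal form procedure, restricted to the Until fragment. It is not the prototype of the paper: the tuple encoding of formulas, the function name `normal_form`, and the convention that a bounded Until carries the name of its timer (which avoids the renaming function) are all assumptions made for illustration, and conflicting settings of the same timer are not handled.

```python
# Formula encoding (hypothetical): ("true",), ("false",), ("prop", p),
# ("nprop", p), ("gt0", x), ("le0", x), ("or", a, b), ("and", a, b),
# ("until_d", f1, f2, d, x), ("until_x", f1, f2, x),
# ("set", frozenset of (timer, value) pairs, f), ("next", f).
LITERALS = {"prop", "nprop", "gt0", "le0"}


def normal_form(formula):
    """Rewrite {(empty, {formula}, empty)} until only literals remain in Now."""
    todo = [(frozenset(), frozenset([formula]), frozenset())]
    done = set()
    while todo:
        ts, now, nxt = todo.pop()
        psi = next((f for f in now if f[0] not in LITERALS), None)
        if psi is None:                     # term is in normal form
            done.add((ts, now, nxt))
            continue
        rest, kind = now - {psi}, psi[0]
        if kind == "true":
            todo.append((ts, rest, nxt))
        elif kind == "false":
            pass                            # the whole term is dropped
        elif kind == "or":
            todo.append((ts, rest | {psi[1]}, nxt))
            todo.append((ts, rest | {psi[2]}, nxt))
        elif kind == "and":
            todo.append((ts, rest | {psi[1], psi[2]}, nxt))
        elif kind == "until_d":             # introduce the timer, cf. (1)
            _, f1, f2, d, x = psi
            timed = ("set", frozenset([(x, d)]), ("until_x", f1, f2, x))
            todo.append((ts, rest | {timed}, nxt))
        elif kind == "until_x":             # unfold once, cf. (3)
            _, f1, f2, x = psi
            unfolded = ("or", f2,
                        ("and", ("gt0", x), ("and", f1, ("next", psi))))
            todo.append((ts, rest | {unfolded}, nxt))
        elif kind == "set":                 # move the setting into TS
            todo.append((ts | psi[1], rest | {psi[2]}, nxt))
        elif kind == "next":                # move the guarded formula to Next
            todo.append((ts, rest, nxt | {psi[1]}))
        else:
            raise ValueError(f"unsupported formula kind: {kind}")
    return done
```

Calling `normal_form(("until_d", ("true",), ("prop", "p"), 2, "x"))` yields two terms that correspond to the last line of the derivation above: one with $Now = \{p\}$ and empty $Next$, and one with $Now = \{x > 0\}$ and the timed Until in $Next$, both with the timer setting $[x := 2]$.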

4 Tableau Construction 

4.1 The Tableau Algorithm 

The construction of a tableau automaton for a basic Mitl$_\le$ formula $\varphi$ is
based upon the normal form introduced in the previous section. The number of
formulas that may occur in the $Now$ and $Next$ sets of the normal form terms
is limited to syntactic subformulas of $\varphi$ and a number of formulas derived from
them such as timer conditions and Until or Release formulas indexed by a timer.
However, the procedure introduces new timers when applying the reduction for
$\psi = TS'.\psi'$. If not applied carefully, this could lead to an unbounded number of
timers and locations of the tableau automaton. To prevent this, we use a unique
timer $x_\psi$ for every Until or Release formula $\psi$ occurring in $\varphi$. It follows from the
normal form procedure that the only use of the timer of an Until (Release) will
be in the $(\varphi_1 \mathcal{U}_{\le x} \varphi_2)$ ($(\varphi_1 \mathcal{V}_{<x} \varphi_2)$) variant of its corresponding formula. Then we
can apply the following equivalences to limit the number of timers.

$(\varphi_1 \mathcal{U}_{\le x} \varphi_2) \wedge ([y := d].(\varphi_1 \mathcal{U}_{\le y} \varphi_2)) \equiv \varphi_1 \mathcal{U}_{\le x} \varphi_2$ if $\nu(x) \le d$    (6)

$(\varphi_1 \mathcal{V}_{<x} \varphi_2) \wedge ([y := d].(\varphi_1 \mathcal{V}_{<y} \varphi_2)) \equiv [y := d].(\varphi_1 \mathcal{V}_{<y} \varphi_2)$ if $\nu(x) \le d$    (7)

The validity of these rules follows straightforwardly from the semantics. The
use of the case $\psi = TS'.\psi'$ can be circumvented by replacing the corresponding
rules in the normal form procedure with the following new rules for the
cases $\varphi_1 \mathcal{U}_{\le d} \varphi_2$ and $\varphi_1 \mathcal{V}_{<d} \varphi_2$, based on the equivalences presented above. In the
remainder, $NF(\psi)$ refers to the updated version of the normal form procedure.



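Case                                              $\Phi \cup \{(TS, Now \cup \{\psi\}, Next)\}$ reduces to:

$\psi = \varphi_1 \mathcal{U}_{\le d} \varphi_2$  $\Phi \cup \{(TS[x_\psi := d], Now \cup \{\varphi_1 \mathcal{U}_{\le x_\psi} \varphi_2\}, Next)\}$
                                                  if $\{x_\psi > 0, \varphi_1 \mathcal{U}_{\le x_\psi} \varphi_2\} \cap Now = \emptyset$
$\psi = \varphi_1 \mathcal{U}_{\le d} \varphi_2$  $\Phi \cup \{(TS, Now \cup \{\varphi_1 \mathcal{U}_{\le x_\psi} \varphi_2\}, Next)\}$
                                                  if $\{x_\psi > 0, \varphi_1 \mathcal{U}_{\le x_\psi} \varphi_2\} \cap Now \ne \emptyset$
$\psi = \varphi_1 \mathcal{V}_{<d} \varphi_2$     $\Phi \cup \{(TS[x_\psi := d], Now \cup \{\varphi_1 \mathcal{V}_{<x_\psi} \varphi_2\}, Next)\}$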



The tableau automaton of an Mitl$_\le$ formula $\varphi$ is computed in the following
way.

Definition 6. Let $\varphi$ be a basic Mitl$_\le$ formula and $Prop$ be the set of atomic
propositions that occur in $\varphi$. Then the tableau automaton $A_\varphi$ of $\varphi$ is the
automaton $(L, T, L_0, Q, TC, E)$ over the alphabet $2^{Prop}$, where

- $T$ is the set of all timers $x_\psi$ for every Until or Release formula $\psi$ that occurs
as a syntactic subformula of $\varphi$.

- The locations ($L$), initial extended locations ($L_0$) and transitions ($E$) are
computed by the procedure depicted in Figure 2. The locations $\ell \in L$ are pairs
$(Now, Next)$ of sets. The first item of the location $\ell$ is denoted by $Now(\ell)$
and the last by $Next(\ell)$.

- $Q(\ell) = \{\sigma \in 2^{Prop} \mid \forall p \in Prop:\ p \in Now(\ell) \Rightarrow p \in \sigma,\ \neg p \in Now(\ell) \Rightarrow p \notin \sigma\}$.
That is, a location $\ell$ is labelled with all sets of propositions that are consistent
with the atomic propositions and the negated atomic propositions in $Now(\ell)$.

- $TC(\ell) = \{\xi \in TCond(T) \mid \xi \in Now(\ell)\}$, the location is labelled with all
timer conditions in $Now(\ell)$.
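
The two labellings of Definition 6 can be sketched in Python as follows. The sketch assumes a hypothetical encoding in which the literals of a $Now$ set are tuples such as `("prop", "p")`, `("nprop", "p")`, `("gt0", "x")` and `("le0", "x")`, and that every proposition mentioned in $Now$ belongs to `props`; the function names are illustrative only.

```python
from itertools import combinations


def symbol_label(now, props):
    """Q(l): all subsets of props consistent with the literals in Now."""
    required = {f[1] for f in now if f[0] == "prop"}
    forbidden = {f[1] for f in now if f[0] == "nprop"}
    if required & forbidden:
        return set()            # p and not-p together: no symbol is allowed
    free = sorted(set(props) - required - forbidden)
    label = set()
    for size in range(len(free) + 1):
        for extra in combinations(free, size):
            label.add(frozenset(required | set(extra)))
    return label


def timer_label(now):
    """TC(l): the timer conditions that occur in Now."""
    return {f for f in now if f[0] in ("gt0", "le0")}
```

Propositions that are mentioned neither positively nor negatively in $Now$ are left unconstrained, so the label grows exponentially in their number; an implementation may prefer to keep the label symbolic.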



4.2 Example 

If we take the formula $\Box_{<100}\Diamond_{\le 5}\, p = false\, \mathcal{V}_{<100}\, (true\, \mathcal{U}_{\le 5}\, p)$ and apply the
tableau algorithm, we arrive at the automaton represented in Figure 3. Only the
formulas in the $Now$ set of the locations have been depicted. Initial extended
locations are represented by a small arrow not originating from any location
leading to the initial location and labelled with a timer setting that yields the
initial timer valuation when applied to the timer valuation that assigns 0 to
every timer. There is a timer associated with the formula $\Diamond_{\le 5}\, p$ named $x$ and a
timer associated with the formula $\Box_{<100}\Diamond_{\le 5}\, p$ named $y$.








$L_0$ := $\{((Now, Next), TS(\mathbf{0})) \mid (TS, Now, Next) \in NF(\varphi)\}$
$L_{New}$ := $\{(Now, Next) \mid ((Now, Next), \nu) \in L_0\}$
$L$ := $\emptyset$, $E$ := $\emptyset$
while $L_{New} \ne \emptyset$ do
    Let $(Now, Next) \in L_{New}$
    $L_{New}$ := $L_{New} \setminus \{(Now, Next)\}$
    $L$ := $L \cup \{(Now, Next)\}$
    for every $(TS', Now', Next') \in NF(Next)$ do
        $E$ := $E \cup \{((Now, Next), TS', (Now', Next'))\}$
        if $(Now', Next') \notin L$ then $L_{New}$ := $L_{New} \cup \{(Now', Next')\}$
    od
od



Fig. 2. Algorithm for constructing the locations and edges of the on-the-fly tableau 
automaton. 
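
The worklist structure of Figure 2 can be sketched in Python as follows. The sketch is generic: the normal form procedure is passed in as a callback, and the names `build_tableau` and `nf_of_next` are hypothetical. Unlike Figure 2, the initial extended locations are stored with their timer setting rather than with the valuation $TS(\mathbf{0})$, which is an assumption made to keep the example short.

```python
def build_tableau(initial_terms, nf_of_next):
    """Explore locations reachable from the terms of NF(phi).

    initial_terms: iterable of (ts, now, nxt) triples of hashable values.
    nf_of_next:    callback mapping a Next set to the terms of NF(Next).
    Returns (initial, locations, edges).
    """
    initial = {((now, nxt), ts) for ts, now, nxt in initial_terms}
    pending = {location for location, _ in initial}
    locations, edges = set(), set()
    while pending:
        location = pending.pop()
        locations.add(location)
        _, nxt = location
        for ts2, now2, nxt2 in nf_of_next(nxt):
            target = (now2, nxt2)
            edges.add((location, ts2, target))
            if target not in locations:
                pending.add(target)     # only unexplored locations are queued
    return initial, locations, edges
```

Only locations that are actually reachable through normal form terms are ever created, which is what makes the construction on-the-fly: a caller could stop the loop early, or interleave it with an emptiness check, without first building the whole automaton.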



4.3 Correctness 

In this section we will give a sketch of the proof that the tableau construction
is correct, i.e. that for any basic Mitl$_\le$ formula $\varphi$, the tableau automaton of $\varphi$
accepts precisely those timed state sequences that satisfy $\varphi$, as expressed by the
following theorem.

Theorem 1. Let $\varphi$ be a basic Mitl$_\le$ formula and $A_\varphi$ the corresponding tableau
automaton. Then for every timed state sequence $\rho$, $A_\varphi$ accepts $\rho$ iff $\rho \models \varphi$.

This theorem follows from soundness (every state sequence accepted by $A_\varphi$
satisfies $\varphi$) and completeness (every state sequence satisfying $\varphi$ is accepted by
$A_\varphi$) of the construction as expressed by Lemmas 7 and 10 below. The structure
of the proof is similar to the correctness proof of the untimed on-the-fly tableau
of [8]. In this section, we assume that $\varphi$ is a basic Mitl$_\le$ formula and $A_\varphi =
(L, T, L_0, Q, TC, E)$ its tableau automaton.

Soundness In this paragraph it is demonstrated that the automaton accepts
only timed state sequences that satisfy $\varphi$. It is shown that whenever a formula
$\psi$ leads to the normal form term that corresponds to a particular location $\ell$,
then any state sequence for which there is an $\ell$-run satisfies the formula $\psi$. To
do this, we associate with a location $\ell = (Now, Next)$, reached after performing
the timer setting $TS$, the set $Old(TS, \ell)$ as the set of all formulas for which
$(TS, Now, Next)$ is a term of the normal form. Note that in [8], a similar Old set
is computed during the construction of the tableau and is part of a location.

Definition 7. Let $TS$ be a timer setting and $\ell = (Now, Next)$.

Old{TS,£) = | {TS, Now, Next) G NF{{TS' , Now' , Next')) 

for some TS' , Next'} 





Fig. 3. Example tableau automaton of the formula $\Box_{<100}\Diamond_{\le 5}\, p$



The main lemma is the following, claiming that any formula in the Old set of a
particular location is dealt with correctly.

Lemma 5. Let $\rho$ be a timed state sequence and $\bar{r} = (\bar{\ell}, \bar{I}, \bar{\nu})$ be a run of $A_\varphi$ for $\rho$,
taking the edges $(\bar{\ell}(k), TS_{k+1}, \bar{\ell}(k+1))$. Furthermore, let $TS$ be a timer setting
such that $\bar{\nu}(0) = TS(\nu)$ for some timer valuation $\nu$ and $\psi \in Old(TS, \bar{\ell}(0))$. Then
$\rho \models_{\bar{\nu}(0)} \psi$.

Proof. By induction on the structure of $\psi$. We only show the cases related to
the Until formula.

- If $\varphi_1 \mathcal{U}_{\le d} \varphi_2 \in Old(TS, \bar{\ell}(0))$, then it can be shown by the reduction of
$\varphi_1 \mathcal{U}_{\le d} \varphi_2$ in the disjunctive temporal form procedure that $\varphi_1 \mathcal{U}_{\le x} \varphi_2 \in
Old(TS, \bar{\ell}(0))$. By induction it follows that $\rho \models_{\bar{\nu}(0)} \varphi_1 \mathcal{U}_{\le x} \varphi_2$. Since $x$ cannot
be larger than $d$ in $\bar{\nu}(0)$, it follows that $\rho \models_{\bar{\nu}(0)} \varphi_1 \mathcal{U}_{\le d} \varphi_2$.

- If $\varphi_1 \mathcal{U}_{\le x} \varphi_2 \in Old(TS, \bar{\ell}(0))$, then by the construction of the automaton and
the reduction of $\varphi_1 \mathcal{U}_{\le x} \varphi_2$ in the normal form procedure, there is some $k$, such
that $\varphi_2 \in Old(TS_k, \bar{\ell}(k))$ and for every $0 \le t < l(\bar{I}(k))$, $\varphi_1 \in Old(\emptyset, \bar{r}(t))$
and $x > 0 \in Old(\emptyset, \bar{r}(t))$. Moreover it can be shown that the timer $x$ is never
set to $d$ in any $TS_i$, $0 < i \le k$, and thus $l(\bar{I}(k)) \le \bar{\nu}(0)(x)$. By induction it
follows that $\rho \models_{\bar{\nu}(0)} \varphi_1 \mathcal{U}_{\le x} \varphi_2$. $\Box$

One can furthermore show that the Old set of every initial location contains the
original formula $\varphi$.

Lemma 6. For every initial extended location $(\ell, \nu)$ of $A_\varphi$, there exists some
timer setting $TS$ such that $\varphi \in Old(TS, \ell)$ and $\nu = TS(\mathbf{0})$.








Proof. Follows straightforwardly from the definitions of the automaton and the
disjunctive temporal form procedure.

From Lemmas 5 and 6, it follows immediately that every state sequence
accepted by the tableau automaton satisfies $\varphi$.

Lemma 7. Let $\rho$ be a timed state sequence. If $A_\varphi$ accepts $\rho$, then $\rho \models \varphi$.



Completeness In this paragraph it is demonstrated that every timed state
sequence that satisfies $\varphi$ is accepted by the tableau automaton. The normal
form procedure guarantees that if a timed state sequence $\rho$ satisfies a formula $\psi$,
then there is a term in the normal form of $\psi$ that is satisfied by $\rho$. Moreover, it
has been shown in Lemma 4 that the formulas in the $Now$ set of the term hold
during the entire first interval of $\rho$. Since the remainder of the state sequence
satisfies the formulas in the $Next$ set, there is a transition that can be taken
by the automaton. This argument can be repeated to construct a run of the
automaton for $\rho$.

The following lemma states that if a timed state sequence is $\varphi$-fine and
satisfies all formulas in the $Next$ set of a location, then there is an edge in the
automaton to a new location where the first interval satisfies all $Now$ formulas
and the tail of the state sequence satisfies all $Next$ formulas again.

Lemma 8. Let $\ell \in L$, $\rho = (\bar{\sigma}, \bar{I})$ a $\varphi$-fine timed state sequence, and $\nu$ a timer
valuation, such that $\rho \models_\nu Next(\ell)$. Then there exists an edge $(\ell, TS, \ell') \in E$
such that for all $0 \le t < l(\bar{I}(1))$, we have that $\rho^t \models_{TS(\nu)-t} Now(\ell')$ and
$\rho^{l(\bar{I}(1))} \models_{TS(\nu)-l(\bar{I}(1))} Next(\ell')$.

Proof. Follows from Lemma 4 and the construction of the tableau automaton.

Similarly we can use Lemma 4 to show that if a timed state sequence $\rho$ satisfies
$\varphi$, then there is an appropriate initial extended location to start a run for $\rho$.

Lemma 9. Let $\rho = (\bar{\sigma}, \bar{I})$ be a $\varphi$-fine timed state sequence such that $\rho \models \varphi$.
Then there is some $(\ell, \nu) \in L_0$ such that for all $0 \le t < l(\bar{I}(1))$, $\rho^t \models_{\nu-t} Now(\ell)$
and $\rho^{l(\bar{I}(1))} \models_{\nu-l(\bar{I}(1))} Next(\ell)$.

From Lemma 9 and repeatedly applying Lemma 8 to construct a run, it follows
that $A_\varphi$ accepts all timed state sequences that satisfy $\varphi$.

Lemma 10. Let $\rho$ be a timed state sequence. If $\rho \models \varphi$, then $A_\varphi$ accepts $\rho$.

4.4 Implementation 

In order to validate the tableau construction described in this section, a prototype
implementation of the algorithm has been made. Table 1 shows a few of the
formulas that have been tested and the numbers of locations, edges and timers of 
the corresponding tableau automata. The size of the automata grows relatively 
mildly with the size of the formula. As we know of no other implementations, 
we have been unable to collect comparative results. 




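Table 1. Numbers of states, transitions and timers of the tableaux of different formulas

Formula                                                                                               States  Transitions  Timers
$\neg\Diamond_{\le 5}\, p$                                                                            4       6            1
$\Box_{<100}\Diamond_{\le 5}\, p$                                                                     10      22           2
$\Diamond_{\le 5}(\Box_{<1}\, p \vee \Box_{<1}\, q)$                                                  11      21           3
$p\, \mathcal{U}_{\le 1}\, (q\, \mathcal{U}_{\le 1}\, (r\, \mathcal{U}_{\le 1}\, s))$                 14      30           3
$p \Rightarrow (\Box_{<5}(q \Rightarrow \Diamond_{\le 1}\, r))$                                       15      48           2
$(p \Rightarrow \Diamond_{\le 5}\, q)\, \mathcal{U}_{\le 100}\, \Box_{<5}\neg p$                      21      64           3
$(((p\, \mathcal{U}_{\le 4}\, q)\, \mathcal{U}_{\le 3}\, r)\, \mathcal{U}_{\le 2}\, s)\, \mathcal{U}_{\le 1}\, t$   60      271          4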



5 Conclusions and Future Work 



We have presented an on-the-fly tableau construction for the fragment Mitl$_\le$ of
Mitl. Technically, this required the introduction of explicit timers and timer set
operators into the logic, as well as a Next operator. Within this extended
syntax, we could then define equivalence-preserving rewrite rules that syntactically
separate any given formula into three parts: the timer settings, the constraints
on the current time interval, and the constraints on the future intervals. The
resulting normal form procedure is the first ingredient of the tableau
construction. The second ingredient is given by two more equivalences that are used to
restrict the number of timers introduced in the tableau automaton to one timer
per Until and per Release formula.

In [4] it has been shown that the construction of tableaux for Mitl$_{0,\infty}$, a
slightly different fragment of Mitl, is PSPACE-complete, and this result can be
adapted to the case of Mitl$_\le$. Thus, the theoretical worst-case complexity of
our construction is the same as that of the construction in [4]. Being on-the-fly,
we expect our algorithm to use less memory and to give smaller tableaux in
many cases in practice. However, as we know of no other implementations, we
have been unable to collect comparative results.

By restricting the interval sequences of models to be left-closed and right-open,
and subscripts of (basic) Until formulas to be of the form $\le d$, we could
confine ourselves to timed automata of a restricted form, so that in the tableaux
no extra clocks are needed to distinguish between transitions from a right-open
into a left-closed interval and transitions from a right-closed into a left-open
interval. To demonstrate the construction, we have implemented the algorithm
and shown experimental results for several formulas. We are currently extending
the results of this paper to also include Until formulas without time bounds,
leading to timed automata with acceptance conditions. We also consider an
implementation of the presented algorithm in the context of the SHESim platform
for specification and simulation ([7]). Another direction for future work is the
generalisation of optimisations that have been developed for (on-the-fly) tableau
constructions in the untimed case (see e.g. [5,6,8]).








References 

1. R. Alur. Techniques for automatic verification of real-time systems. PhD thesis, 
Stanford University, 1991. 

2. R. Alur, C. Conrcoubetis, and D. Dill. Model-checking in dense real-time. Infor- 
mation and Computation, 104:2-34, 1993. 

3. R. Alur and D.L. Dill. A theory of timed automata. Theoretical Computer Science, 
126:183-235, 1994. 

4. R. Alur, T. Feder, and T. Henzinger. The benefits of relaxing punctuality. Journal 
of the ACM, 43(1):116-146, January 1996. 

5. M. Daniele, F. Giunchiglia, and M. Y. Vardi. Improved automata generation for
linear temporal logic. In N. Halbwachs and D. Peled, editors, Computer Aided
Verification: 11th International Conference Proceedings, CAV'99, Trento, Italy,
July 6-10, 1999 (LNCS 1633), pages 249-260. Springer, 1999.

6. K. Etessami and G. Holzmann. Optimizing Büchi automata. To appear in
Proceedings of CONCUR'2000, 2000.

7. M.C.W. Geilen and J.P.M. Voeten. Object-oriented modelling and specification
using SHE. In R.C. Backhouse and J.C.M. Baeten, editors, Proceedings of the First
International Symposium on Visual Formal Methods VFM'99, pages 16-24.
Computing Science Reports 99/08, Department of Mathematics and Computer Science,
Eindhoven University of Technology, 1999.

8. R. Gerth, D. Peled, M.Y. Vardi, and P. Wolper. Simple on-the-fly automatic
verification of linear temporal logic. In Proc. IFIP/WG6.1 Symp. Protocol Specification
Testing and Verification (PSTV95), Warsaw Poland, pages 3-18. Chapman & Hall,
June 1995.

9. T. Henzinger. It’s about time: real-time logics reviewed. In D. Sangiorgi and 
R. de Simone, editors. Proceedings of the 9th International Conference on Concur- 
rency Theory (CONCUR 1998), pages 439-454, Berlin, 1998. Springer. 

10. T. Henzinger, X. Nicollin, J. Sifakis, and S. Yovine. Symbolic model checking for
real-time systems. Information and Computation, 111(2):193-244, June 1994.

11. O. Lichtenstein and A. Pnueli. Checking that finite state concurrent programs
satisfy their linear specification. In Twelfth Annual ACM Symposium on Principles
of Programming Languages, pages 97-107. ACM SIGACT/SIGPLAN, 1985.

12. A. Pnueli. The temporal logic of programs. In Proc. of the 18th Annual Symposium 
on Foundations of Computer Science, pages 46-57. IEEE Computer Society Press, 
1977. 

13. J.-F. Raskin. Logics, automata and classical theories for deciding real time. PhD
thesis, Facultés Universitaires Notre-Dame de la Paix, Namur (Belgium), June
1999.

14. M. Y. Vardi and P. Wolper. An automata-theoretic approach to automatic program 
verification (preliminary report). In Logic in Computer Science, pages 332-344. 
IEEE TC-MFC, IEEE Computer Society Press, 1986. 




Verifying Universal Properties 
of Parameterized Networks * 



Kai Baukus^1, Yassine Lakhnech^2, and Karsten Stahl^1



^1 Institute of Computer Science and Applied Mathematics
CAU Kiel, Preusserstr. 1-9, D-24105 Kiel, Germany.

{kba, kst}@informatik.uni-kiel.de
^2 Verimag***, Centre Equation, 2 Av. de Vignate,
38610 Gières, France, lakhnech@imag.fr



Abstract. We present a method for verifying universal properties of
fair parameterized networks of finite processes, that is, properties of
the form $\forall p_1 \ldots p_n : \varphi$, where $\varphi$ is a
quantifier-free LTL formula. The starting point of our verification
method is an encoding of the infinite family of networks by a single
fair transition system whose variables are set (2nd-order) variables
and whose transitions are described in WS1S; such a system is called a
WS1S transition system. We abstract the WS1S system into a finite-state
system that can be model-checked. We present a generic abstraction
relation for verifying universal properties as well as an algorithm for
computing an abstract system. Since the abstract system may contain
infinite computations that have no corresponding fair computations at
the concrete level, the verification of progress properties often
fails. Therefore, we present methods for synthesizing fairness
conditions from the parameterized network and discuss under which
conditions, and how, fairness conditions of this network can be lifted
to fairness conditions on the abstract system. We implemented our
methods in a tool, called PAX, and applied it to several examples.



1 Introduction 

Problem statement and contributions. We present a method for verifying
universal properties of fair parameterized networks of finite processes.
In other words, we present a method for tackling the following problem:
Given a parameterized network $P_1 \parallel \cdots \parallel P_n$,
fairness conditions, and a quantifier-free linear-time temporal property
$\psi(p_1, \ldots, p_k)$, we want to prove
$P_1 \parallel \cdots \parallel P_n \models \forall p_1, \ldots, p_k < n : \psi(p_1, \ldots, p_k)$
for every $n \in \omega$, i.e., every fair computation of
$P_1 \parallel \cdots \parallel P_n$ satisfies
$\forall p_1, \ldots, p_k < n : \psi(p_1, \ldots, p_k)$.

* This work has been partially supported by the Esprit-LTR project Vires.

*** Verimag is a joint research laboratory of the University Joseph Fourier (Grenoble
I), National Polytechnical Institute of Grenoble (INPG) and the National Center of
Scientific Research (CNRS).



M. Joseph (Ed.): FTRTFT 2000, LNCS 1926, pp. 291-303, 2000. 
© Springer-Verlag Berlin Heidelberg 2000





Our approach is verification by abstraction [CC77,CGL94,DGG94]
and consists of the following steps:

1. Representing the infinite family of fair networks
$P_1 \parallel \cdots \parallel P_n$ as a single fair transition
system $S$ whose variables range over finite subsets of $\omega$ and
whose transitions are expressed in WS1S, the weak second-order logic
of one successor [Büc60]. We call such systems fair WS1S transition
systems.

2. Constructing an abstraction relation that maps the states of the WS1S
transition system $S$ to abstract states, which are valuations of
boolean variables. We present a generic abstraction relation for
verifying universal properties.

3. Automatically constructing a finite abstract system $S_A$ that is an
abstraction of $S$, which implies that every computation of $S$ can be
mapped to a computation of $S_A$. Moreover, we construct an abstract
formula $\psi_A$ such that if $S_A$ satisfies $\psi_A$, then we can
deduce $P_1 \parallel \cdots \parallel P_n \models \forall p_1, \ldots, p_k < n : \psi(p_1, \ldots, p_k)$
for every $n \in \omega$.

4. Since the abstract system is finite, we can use model-checking to verify
that it satisfies $\psi_A$. However, verifying progress properties using
abstractions often fails because of infinite computations in the
abstract system that do not correspond to fair infinite ones in the
concrete one. To mitigate this problem, we augment the abstract system
with safe fairness conditions, that is, conditions that only remove
infinite computations that do not correspond to concrete ones. We
present two techniques for synthesizing fairness conditions from the
concrete system that can be safely added to the abstract one:

a) An algorithm that, given a WS1S formula characterizing a ranking
function, computes pairs of sets of transitions expressing strong
fairness conditions that are guaranteed to hold for the parameterized
network; hence, abstractions of them can be safely added at the
abstract level.

b) A method for generating fairness conditions at the abstract level
from the fairness conditions of the parameterized network. In
particular, we discuss which kinds of weak/strong fairness can be
lifted from the concrete to the abstract level.

We implemented our method in a tool called PAX^1, which uses the
decision procedures of Mona [HJJ+96] to check the satisfiability of WS1S
formulae. We then applied our tool and method to several examples,
including Dijkstra's and Szymanski's mutual exclusion algorithms.



^1 http://www.informatik.uni-kiel.de/~kba/pax







Relevance and related work. There has recently been much interest in
the automatic and semi-automatic verification of parameterized networks.
The methods presented in [GS92,EN96] show that for restricted classes of
ring networks of arbitrary size, there exists $k$ such that the
verification of the parameterized network can be reduced to the
verification of networks of size up to $k$. Alternative methods
presented in [KM89,WL89,BCG89] are based on induction on the number of
processes. These methods require finding a network invariant that
abstracts an arbitrary number of processes with respect to a pre-order
that preserves the property to be verified. While this method was
originally presented for linear networks, it has been generalized in
[CGJ95] to networks generated by context-free grammars. In [CGJ95],
abstract transition systems were used to specify the invariant. An
abstract transition system consists of abstract states specified by
regular expressions and transitions between abstract states. The idea of
representing sets of states of parameterized networks by regular
languages is applied in [KMM+97], where additionally finite-state
transducers are used to compute predecessors. These ideas are applied
to linear networks as well as to processes arranged in a tree
architecture, and semi-automatic symbolic backward analysis methods for
solving the reachability problem are given. The work in [ABJN99,JN00]
extends the ideas in [KMM+97] by considering the effect of applying
infinitely often a transition that satisfies certain restrictions. In
[BBLS00], we presented a method based on abstraction for verifying
invariance and a restricted class of liveness properties of
parameterized networks. The method we present here allows us to deal
with a larger class of progress properties.



2 Preliminaries 

In this section we briefly recall the definition of the weak second-order
theory of one successor (WS1S for short) [Büc60,Tho90].

Terms of WS1S are built up from the constant 0 and 1st-order variables
by applying the successor function $suc(t)$ ("$t + 1$"). Atomic formulae
are of the form $b$, $t = t'$, $t < t'$, $t \in X$, where $b$ is a
boolean variable, $t$ and $t'$ are terms, and $X$ is a set variable
(2nd-order variable). WS1S formulae are built up from atomic formulae by
applying the boolean connectives as well as quantification over both
1st-order and 2nd-order variables.

WS1S formulae are interpreted in models that assign finite subsets
of $\omega$ to 2nd-order variables and elements of $\omega$ to 1st-order
variables. The interpretation is defined in the usual way.







Given a WS1S formula $f$, we denote by $[\![f]\!]$ the set of models of
$f$. The set of free variables in $f$ is denoted by $free(f)$.

In addition to the usual abbreviations, given a 2nd-order variable
$P$, we write $\forall_P i : f$ instead of
$\forall i : i \in P \Rightarrow f$ and $\exists_P i : f$ instead of
$\exists i : i \in P \land f$.

Finally, we recall that by Büchi [Büc60] and Elgot [Elg61] the
satisfiability problem for WS1S is decidable. Indeed, the set of all
models of a WS1S formula is representable by a finite automaton (see,
e.g., [Tho90]).

3 Parameterized Networks as WS1S Transition Systems

We introduce WS1S transition systems, which are transition systems with
variables ranging over finite subsets of $\omega$, and show how they can
be used to represent parameterized networks. In order to simulate the
behavior of parameterized networks with fairness conditions we also need
the notion of fairness for WS1S transition systems.

Definition 1 (Fair WS1S Transition Systems).

A fair WS1S transition system $S = (V, \Theta, T, \mathcal{J}, \mathcal{C})$
is given by the following components:

- $V = \{X_1, \ldots, X_k\}$: A finite set of second-order variables
$X_i$ ranging over finite sets of natural numbers.

- $\Theta$: A WS1S formula with $free(\Theta) \subseteq V$ describing the
initial condition of the system.

- $T$: A finite set of transitions where each $\tau \in T$ is represented
as a WS1S formula $\rho_\tau(V, V')$, i.e.,
$free(\rho_\tau) \subseteq V \cup V'$.

- $\mathcal{J}$: A set of pairs of second-order variables expressing weak
fairness conditions. Each pair $(X_m, X_{m'})$ requires, for each
$i \in \omega$, that $i$ cannot be continuously in $X_m$ without being
eventually in $X_{m'}$, that is,
$\forall i \in \omega : (\Diamond\Box(i \in X_m) \Rightarrow \Box\Diamond(i \in X_{m'}))$.

- $\mathcal{C}$: A set of pairs $(X_m, X_{m'})$ of second-order variables
expressing the strong fairness condition
$\forall i \in \omega : (\Box\Diamond(i \in X_m) \Rightarrow \Box\Diamond(i \in X_{m'}))$. □

A state $s$ of $S$ is a mapping from the variables in $V$ into finite
subsets of $\omega$. A computation of $S$ is a sequence
$(s_i)_{i \in \omega}$ of states such that $\Theta[s_0(V)/V]$ and, for
every $i \in \omega$,
$\bigvee_{\tau \in T} \rho_\tau[s_i(V), s_{i+1}(V)/V, V']$ are valid
formulae. A computation satisfies a weak fairness condition
$(X_m, X_{m'}) \in \mathcal{J}$ iff the following condition holds for
every $x \in \omega$:

if $\exists i \in \omega . \forall j \geq i : x \in s_j(X_m)$, then there
exist infinitely many $i$'s such that $x \in s_i(X_{m'})$.







The computation satisfies the strong fairness condition
$(X_m, X_{m'}) \in \mathcal{C}$ iff the following condition holds for
every $x \in \omega$:

if there exist infinitely many $i$'s such that $x \in s_i(X_m)$, then
there exist infinitely many $i$'s such that $x \in s_i(X_{m'})$.

Then, a fair computation of $S$ is a computation that satisfies all
fairness conditions in $\mathcal{J}$ and $\mathcal{C}$. Henceforth, we
denote the set of fair computations of $S$ by $[\![S]\!]$.
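The two fairness notions can be tried out on ultimately periodic computations, where "infinitely often" simply means "somewhere on the repeated loop" and "continuously from some point on" means "in every loop state". The following Python sketch makes that assumption explicit; the representation of states as dictionaries and the names `weakly_fair`, `strongly_fair`, `E` and `T` are illustrative choices, not part of the paper.

```python
from typing import Dict, FrozenSet, List

# A state maps each set variable to a finite set of process indices.
State = Dict[str, FrozenSet[int]]


def _seen_in(loop: List[State], var: str) -> FrozenSet[int]:
    """Indices that occur in `var` somewhere on the loop, i.e. infinitely often."""
    return frozenset().union(*(s[var] for s in loop))


def weakly_fair(loop: List[State], xm: str, xm2: str) -> bool:
    """Weak fairness (Xm, Xm'): continuously in Xm implies infinitely often in Xm'."""
    for x in _seen_in(loop, xm):
        continuously = all(x in s[xm] for s in loop)
        if continuously and x not in _seen_in(loop, xm2):
            return False
    return True


def strongly_fair(loop: List[State], xm: str, xm2: str) -> bool:
    """Strong fairness (Xm, Xm'): infinitely often in Xm implies infinitely often in Xm'."""
    return _seen_in(loop, xm) <= _seen_in(loop, xm2)


# Loop of a computation: process 1 stays enabled and is taken; process 0 is
# enabled only every other step and is never taken.
loop = [
    {"E": frozenset({0, 1}), "T": frozenset()},
    {"E": frozenset({1}), "T": frozenset({1})},
]
print(weakly_fair(loop, "E", "T"), strongly_fair(loop, "E", "T"))
```

On this loop the weak condition holds while the strong one fails, because process 0 is enabled infinitely often but never continuously.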

As a running example we use a simple mutual exclusion algorithm to
illustrate how to represent a parameterized network as a WS1S system
and how to analyze it.

Example 1. The parameterized network consists of processes where each
process is described as follows:

[Figure: control graph of process $i$ over the locations $\ell_0$,
$\ell_1$, $\ell_2$; the recoverable edge label is
\[
(\forall_n j : \neg turn_j \lor at\_\ell_0[j])
\land turn_i' = true
\land (\forall_n j : j \neq i \Rightarrow (turn_j' = false
\land \bigwedge_{l=0,1,2} at\_\ell_l[j] = at\_\ell_l'[j]))
\]
]

The transition from $\ell_0$ to $\ell_1$ is weakly fair, whereas the loop
from $\ell_1$ to $\ell_1$ is strongly fair. Initially, all processes are
at $\ell_0$. Location $\ell_2$ represents the critical section.

It is easy to see how each process $P_i$ can be described using a boolean
variable $at\_\ell_m[i]$ for each control point $\ell_m[i]$.

We will verify that the algorithm satisfies the mutual exclusion
property as well as the universal property that each process $p$ reaches
its critical section infinitely often, i.e.,
$\forall_n p : \Box\Diamond\, at\_\ell_2[p]$.

To represent this network as a WS1S system we introduce three set
variables $At\_\ell_0$, $At\_\ell_1$, $At\_\ell_2$ corresponding to the
control locations, the set variable $Turn$ corresponding to $turn$, and
a set variable $P$ representing the set of processes that are part of
the network. Moreover, we need two additional set variables $E_\tau$ and
$T_\tau$ for each transition $\tau$ to express the fairness conditions:
a process index is a member of these sets whenever the corresponding
transition is enabled (resp. just taken) for this process. Let $V$
denote this set of variables. If we denote by $\tau_{11}$ the self-loop
at $\ell_1$, then $\mathcal{C} = \{(E_{\tau_{11}}, T_{\tau_{11}})\}$. The
liveness property we will check later can then be expressed by
$\forall_P p : \Box\Diamond(p \in At\_\ell_2)$. For the sake of
illustration, we







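show the representation of $\tau_{11}$:

\[
\begin{array}{l}
\exists_P i : i \in At\_\ell_1 \land (\forall_P j : j \notin Turn \lor j \in At\_\ell_0)
 \land i \in At\_\ell_1' \land i \in Turn' \\
\quad \land\ (\forall_P j : j \neq i \Rightarrow (j \notin Turn'
 \land \bigwedge_{l=0,1,2} (j \in At\_\ell_l \Leftrightarrow j \in At\_\ell_l'))) \\
\quad \land\ P = P' \land \bigwedge_{\tau \neq \tau_{11}} T_\tau' = \emptyset
 \land T_{\tau_{11}}' = \{i\} \\
\quad \land\ \bigwedge_{\tau \in T} E_\tau' = \{ i \in P \mid \exists V'' : \rho_\tau(V', V'') \} .
\end{array}
\]

The existential quantification corresponds to an interleaving semantics
where only one process proceeds in one step. Of course, it is also
possible to model synchronous systems by using universal
quantification. □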

Note that the class of systems we can model as WS1S systems is
restricted such that each process has to be finite-state and the
transitions can be characterized in WS1S. The definition of a class that
can be modeled as WS1S systems and the translation can be found in the
full paper [BLS00].

4 Abstracting WS1S Systems

In Section 3, we have shown how we model parameterized networks as fair
WS1S systems. An infinite family of systems is represented by a single,
though infinite-state, transition system. In the following, we present a
method to construct a finite abstraction of a given WS1S system. Then,
we show in Section 5 how the obtained abstract system can be enriched
with fairness conditions such that interesting progress properties of
the WS1S system can be verified. Let us first define what we mean by
universal temporal properties.

Let $\Pi$ be a countable set of process indices $p$ that range over
natural numbers and let $\mathcal{X}$ be a countable set of variables
$X$ that range over finite sets of natural numbers. We do not write the
universal quantifier, as free variables are understood as universally
quantified. The set LTL of linear-time temporal properties over $\Pi$
and $\mathcal{X}$ is defined as follows:

\[
\varphi ::= i \in X \mid p \in X \mid \neg\varphi \mid \varphi \land \varphi
 \mid \bigcirc\varphi \mid \varphi\,\mathcal{U}\,\varphi ,
 \quad \text{where } i \text{ is a constant in } \omega .
\]

As usual, we use the temporal modalities $\Box$ (always) and $\Diamond$
(eventually), which can be introduced as abbreviations.

Formulae in LTL are interpreted over infinite sequences of structures
of the form $(\mathcal{I}, \mathcal{I}')$, where $\mathcal{I}$ maps each
variable $X \in \mathcal{X}$ to a finite subset of $\omega$ and
$\mathcal{I}'$ maps each variable in $\Pi$ to an element of $\omega$. The
definition of the interpretation of LTL is not given here as it is
standard.







Let $\varphi$ be an LTL formula with $\{X_1, \ldots, X_k\}$ as free set
variables and $\{p_1, \ldots, p_n\}$ as free 1st-order variables.
Moreover, let $S$ be a WS1S transition system with
$\{X_1, \ldots, X_k\}$ as variables. A computation
$(s_j)_{j \in \omega}$ satisfies $\varphi$ iff for every injective
mapping $\mathcal{I}'$ from $\{p_1, \ldots, p_n\}$ into $\omega$, the
sequence $(s_j, \mathcal{I}')_{j \in \omega}$ satisfies $\varphi$. In
other words, the computation satisfies $\varphi$ if it satisfies all the
temporal formulae obtained by instantiating the variables
$p_1, \ldots, p_n$. We say that $S$ satisfies $\varphi$, denoted by
$S \models \varphi$, if every fair computation of $S$ satisfies
$\varphi$.

A temporal property is called universal if it can be described by a
formula in LTL. For instance, mutual exclusion of Szymanski's algorithm
can be described by the formula
$\Box\neg(p_1 \in At\_\ell_7 \land p_2 \in At\_\ell_7)$, which is a
universal temporal property. However, the communal liveness property
stating that whenever some process is in $At\_\ell_2$, eventually some
process (not necessarily the same) reaches $At\_\ell_7$ is not a
universal temporal property. On the other hand, the stronger liveness
property stating that every process in $At\_\ell_2$ eventually reaches
$At\_\ell_7$ is a universal property, as it can be described by the
formula $\Box(p \in At\_\ell_2 \Rightarrow \Diamond\, p \in At\_\ell_7)$.

The problem we are interested in is, given a WS1S system $S$ and a
universal temporal formula $\varphi$, to show $S \models \varphi$.

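Abstractions and fair abstractions. Given a deadlock-free^2 transition
system $S = (V, \Theta, T)$ and a total abstraction relation
$\alpha \subseteq \Sigma \times \Sigma_A$, we say that
$S_A = (V_A, \Theta_A, T_A)$ is an abstraction of $S$ w.r.t. $\alpha$,
denoted by $S \sqsubseteq_\alpha S_A$, if the following conditions are
satisfied: (1) $s_0 \models \Theta$ implies
$\alpha(s_0) \models \Theta_A$ and (2)
$\tau \circ \alpha^{-1} \subseteq \alpha^{-1} \circ \tau_A$.

In case $\Sigma_A$ is finite, we call $\alpha$ a finite abstraction
relation. Let $\varphi$, $\varphi_A$ be LTL formulae and let
$[\![\varphi]\!]$ (resp. $[\![\varphi_A]\!]$) denote the set of models
of $\varphi$ (resp. $\varphi_A$). Then, from $S \sqsubseteq_\alpha S_A$,
$\alpha^{-1}([\![\varphi_A]\!]) \subseteq [\![\varphi]\!]$ and
$S_A \models \varphi_A$ we can conclude $S \models \varphi$ (here we
identify $\alpha^{-1}$ and its pointwise lifting to sequences). In case
$S$ is a fair transition system with $\mathcal{F}$ as fairness formula
and $\mathcal{F}_A$ is the fairness formula of $S_A$, then by requiring
$\alpha^{-1}([\![\neg\mathcal{F}_A]\!]) \subseteq [\![\neg\mathcal{F}]\!]$
we have the same preservation result as above. We indicate this type of
abstraction by $S \sqsubseteq^{\mathcal{F}}_\alpha S_A$.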
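For explicitly given finite systems, the two conditions of the (unfair) abstraction definition can be checked directly. The sketch below is an illustration under assumptions: it reads condition (2) as the usual simulation requirement (every concrete step can be matched by an abstract step from every related abstract state), and the names `is_abstraction`, `alpha`, `trans_a` as well as the modulo-4 counter example are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def is_abstraction(
    init: Set[Hashable],
    trans: Set[Edge],
    alpha: Dict[Hashable, Set[Hashable]],
    init_a: Set[Hashable],
    trans_a: Set[Edge],
) -> bool:
    """Check conditions (1) and (2) for a total relation alpha: state -> abstract states."""
    # (1) every abstract image of an initial concrete state is initial.
    if any(not alpha[s] <= init_a for s in init):
        return False
    # (2) each concrete step s -> s2 is matched from every abstract state related to s.
    for s, s2 in trans:
        for a in alpha[s]:
            if not any((a, a2) in trans_a for a2 in alpha[s2]):
                return False
    return True


# Concrete system: a counter modulo 4; abstraction: the parity of the counter.
trans = {(i, (i + 1) % 4) for i in range(4)}
alpha = {i: {i % 2} for i in range(4)}
print(is_abstraction({0}, trans, alpha, {0}, {(0, 1), (1, 0)}))
```

Dropping the abstract edge from odd to even breaks condition (2), since the concrete step from 1 to 2 then has no abstract counterpart.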

Next, we explain the main steps of our approach for verifying universal
temporal properties before presenting each step in more detail.

Approach. Let $S = (V, \Theta, T, \mathcal{J}, \mathcal{C})$ be a fair
WS1S system modeling a parameterized network and let $\psi$ be a
universal temporal formula with

^2 Throughout this paper we only consider deadlock-free transition systems, which can
be achieved by adding an idle transition.







$\{X_1, \ldots, X_k\} \cup \{p_1, \ldots, p_m\}$ as free variables. To
simplify the presentation, assume that $m = 1$ and write $p$ instead of
$p_1$. Moreover, we denote by $\psi(i)$ the formula obtained from $\psi$
by replacing $p$ by the constant $i \in \omega$.

For each $i \in \omega$, we construct a finite abstraction relation
$\alpha_i$ which maps states of $S$ to abstract states. The abstract
state space defined by $\alpha_i$ is such that it contains for each
sub-formula $i \in X$ of $\psi(i)$ a boolean variable $b^i_X$ and for
each $X_j$ an abstract variable $b_j$. Then, $\alpha_i$ relates a
concrete state $s$ to an abstract state $s_A$ iff
$s_A(b^i_X) \Leftrightarrow i \in s(X)$ and
$s_A(b_j) \Leftrightarrow s(X_j) \neq \emptyset$. Henceforth, let
$\widehat{\alpha}_i$ be a predicate defining $\alpha_i$. Clearly, the
abstract state spaces defined by $\alpha_i$ and $\alpha_j$ are the same
modulo renaming of the variables $b^i_X$.
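As a small illustration of the relation just described, the following Python sketch computes the abstract state related to a concrete state for a fixed index `i`. It is a simplification under assumptions: it creates both kinds of boolean variables for every set variable (the text only needs the membership variable for sub-formulas that occur in the property), and the function name and key format are made up for the example.

```python
from typing import Dict, FrozenSet


def abstract_state(s: Dict[str, FrozenSet[int]], i: int) -> Dict[str, bool]:
    """Abstract state related to `s` for index `i` (illustrative encoding)."""
    abstract: Dict[str, bool] = {}
    for name, value in s.items():
        abstract[f"b^i_{name}"] = i in value      # membership of the chosen index
        abstract[f"b_{name}"] = len(value) > 0    # non-emptiness of the set
    return abstract


s = {"At_l2": frozenset({3}), "Turn": frozenset()}
print(abstract_state(s, 3))
```

Choosing a different index changes only the membership bits, which mirrors the remark that the abstract state spaces coincide up to renaming.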

Then, for each $i \in \omega$, one can effectively construct a finite
abstract system $S^i_A$ and an LTL formula $\psi^i_A$ such that
$S^i_A \models \psi^i_A$ implies $S \models \psi(i)$. One can even
effectively construct the set $\{S^i_A \mid i \in \omega\}$ of abstract
systems. However, although this set is finite, it is computationally
costly to construct. Therefore, we present an algorithm for constructing
a single finite abstract system $S_A$ which is itself an abstraction of
each $S^i_A$ and, as we show, is an abstraction of $S$. Moreover, we show
how to construct an LTL formula $\psi_A$ such that
$S_A \models \psi_A$ implies $S \models \psi$.

Abstraction relation $\alpha_i$. The set $V^i_A$ of abstract variables
consists of boolean variables. For each set $X$ in the WS1S system $S$ we
have an abstract boolean variable $b^i_X \in V^i_A$ corresponding to
$i \in X$. Thus, in particular, we have the variables $b^i_{E_\tau}$ and
$b^i_{T_\tau}$ for each $\tau \in T$. Additionally, for each strong
fairness condition $(E_\tau, T_\tau) \in \mathcal{C}$ we introduce
boolean variables $e_\tau$, $t_\tau$ such that $\widehat{\alpha}_i$
implies:

\[
\begin{array}{l}
e_\tau \equiv \exists_P j : j \in E_\tau \\
t_\tau \equiv \exists_P j : j \in T_\tau .
\end{array}
\]

For all other global state properties $\varphi$ that may influence the
progress of a certain process $p$, another variable is added for which
the abstraction is given by $\varphi$. This includes an adequate
abstraction of the natural numbers used, to express their influence on
the behavior of the system.

Henceforth, we also use $\widehat{\alpha}_i(V', V'_A)$ to denote the
predicate obtained from $\widehat{\alpha}_i$ by substituting the unprimed
variables with their primed versions.

Construction of $S_A$. As mentioned, it is costly to compute
$\{S^i_A \mid i \in \omega\}$ explicitly. Therefore, we show how one can
construct a system that abstracts each of the elements of this set, and
hence, by transitivity of







$\sqsubseteq$, abstracts $S$. The set $V_A$ of abstract variables of
$S_A$ contains for each abstract variable $b^i_X \in V^i_A$ a variable
$b_X$.

We define the transitions of $S_A$ by the following WS1S formula:

\[
\exists_P p : \exists V, V' : \widehat{\alpha}_p(V, V_A) \land \rho_\tau(V, V')
 \land \widehat{\alpha}_p(V', V'_A) .
\]

Thus, we make sure that the choice of $p$ in the concretizations of the
source and target states of an abstract transition is the same. We can
then show the following:

Proposition 1. $S_A$ is an abstraction of $S$, i.e.,
$S \sqsubseteq_\alpha S_A$ with
$\alpha \equiv \exists i \in \omega . \widehat{\alpha}_i[b_X / b^i_X]$.

Notice that the formulae above are WS1S formulae, and hence, by Büchi
and Elgot's result, the sets of models satisfying these formulae can be
characterized by finite automata. We use Mona [HJJ+96] to construct
these automata.
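The effect of the transition formula can be imitated on an explicitly enumerated instance, which may help to see why the same index `p` is used on both sides. The Python sketch below is only an explicit-state illustration for a bounded set of processes; it does not use Mona or automata, and the single set variable, `alpha_p` and `abstract_transitions` are assumptions made for the example.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Concrete = FrozenSet[int]        # one set variable: the processes in the critical section
Abstract = Tuple[bool, bool]     # (p is in the set, the set is non-empty)


def alpha_p(s: Concrete, p: int) -> Abstract:
    """Abstract image of concrete state `s` for the distinguished index `p`."""
    return (p in s, len(s) > 0)


def abstract_transitions(
    trans: Iterable[Tuple[Concrete, Concrete]], processes: Iterable[int]
) -> Set[Tuple[Abstract, Abstract]]:
    """Existential abstraction: the same p is used for source and target."""
    procs = list(processes)
    return {(alpha_p(s, p), alpha_p(s2, p)) for s, s2 in trans for p in procs}


# Two processes; process 0 enters and leaves the critical section.
trans = [(frozenset(), frozenset({0})), (frozenset({0}), frozenset())]
edges = abstract_transitions(trans, range(2))
print(sorted(edges))
```

With `p = 0` the abstract edges describe the distinguished process itself moving, while with `p = 1` they describe some other process moving, which is all the abstraction keeps about the rest of the network.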

5 Fair Abstractions

It is well known that an obstacle to the verification of liveness
properties using abstraction is that the abstract system often contains
cycles that do not correspond to fair computations of the concrete
system. A way to overcome this difficulty is to enrich the abstract
system with fairness conditions or, more generally, ranking functions
over well-founded sets that eliminate undesirable computations. We
present a marking algorithm that, given a reachability state graph of an
abstraction of a WS1S system, enriches the graph with strong fairness
conditions while preserving the property that to each concrete
computation corresponds an abstract fair one. The enriched graph is used
to prove liveness properties of the WS1S system and, consequently, of
the parameterized network. Moreover, we discuss under which requirements
the fairness conditions of the parameterized system can be lifted to the
finite abstract one. In particular, we show that by requiring some
conditions on the abstraction relation, it is sound to lift strong
fairness. Weak fairness can only be lifted for a distinguished process.

Throughout this section, we fix a WS1S system
$S = (V, \Theta, T, \mathcal{J}, \mathcal{C})$ modeling a parameterized
network and an abstraction relation $\alpha$ constructed as explained in
Section 4. Then, let $S_A = (V_A, \Theta_A, T_A)$ be the finite abstract
system (without fairness) obtained by the method introduced in
Section 4. We show how to add fairness conditions to $S_A$ leading to a
fair abstract system
$S^{\mathcal{F}}_A = (V_A, \Theta_A, T_A, \mathcal{J}_A, \mathcal{C}_A)$
such that $S \sqsubseteq^{\mathcal{F}}_\alpha S^{\mathcal{F}}_A$.







Marking algorithm. We use WS1S formulae to express ranking functions.
Let $\chi(i, X_1, \ldots, X_k)$ be a predicate with $i$ as free
1st-order variable and $X_1, \ldots, X_k \in V$ as free 2nd-order
variables. Given a state $s$ of $S$, i.e., a valuation of the variables
in $V$, the ranking value $\zeta(s)$ associated to $s$ by $\chi$ is the
cardinality of $\{ i \in \omega \mid \chi(i, s(X_1), \ldots, s(X_k)) \}$.
The marking algorithm we present labels each abstract transition of the
abstract system with one of the symbols $\{+_\chi, -_\chi, =_\chi\}$.
Intuitively, an abstract transition $\tau_A$ is labeled by $-_\chi$ if it
is guaranteed that the concrete transition $\tau$ associated with
$\tau_A$ decreases the ranking value, i.e., $(s, s') \in \tau$ implies
$\zeta(s) > \zeta(s')$. The label $+_\chi$ denotes that $\tau$ increases
the value for some concrete state. In the other cases we label $\tau_A$
with $=_\chi$.

Input: WS1S system $S = (V, \Theta, T)$, abstraction
$S_A = (V_A, \Theta_A, T_A)$, set of predicates
$\chi(i, X_1, \ldots, X_k)$

Output: Labeling of $T_A$

Description: For each $\chi(i, X_1, \ldots, X_k)$ and for each edge
$\tau_A \in T_A$, let $\tau$ be the concrete transition in $T$
corresponding to $\tau_A$.
Mark $\tau_A$ with $-_\chi$ if the following formula is valid:

\[
\forall V, V' : \widehat{\alpha}(V, V_A) \land \widehat{\alpha}(V', V'_A) \land \rho_\tau(V, V')
 \Rightarrow \{ i \mid \chi'(i) \} \subset \{ i \mid \chi(i) \} .
\]

Mark $\tau_A$ with $+_\chi$ if

\[
\exists V, V' : \widehat{\alpha}(V, V_A) \land \widehat{\alpha}(V', V'_A) \land \rho_\tau(V, V')
 \Rightarrow \{ i \mid \chi'(i) \} \supset \{ i \mid \chi(i) \}
\]

is valid. Otherwise, label the transition with $=_\chi$.

Now, for a set formula $\chi$ we denote by $T^+_\chi$ the set of edges
labeled with $+_\chi$. Then, we add for each such $\chi$ and each
transition $\tau_A$ labeled with $-_\chi$ the fairness condition
$(\tau_A, T^+_\chi)$, which states that $\tau_A$ can only be taken
infinitely often when one of the transitions in $T^+_\chi$ is taken
infinitely often.
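The labeling step can be sketched in Python over explicitly enumerated concrete steps. This is an illustration under assumptions, not the decision procedure used by the tool: each abstract edge is given together with a finite sample of the concrete state pairs it represents, the `+` label follows the prose reading ("increases the value for some concrete state"), and the names `label_edge`, `fairness_conditions` and the set variable `W` are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

State = Dict[str, FrozenSet[int]]
Pair = Tuple[State, State]


def rank_set(chi: Callable[[int, State], bool], s: State, universe: Iterable[int]) -> FrozenSet[int]:
    """The set {i | chi(i, s)} whose cardinality is the ranking value."""
    return frozenset(i for i in universe if chi(i, s))


def label_edge(pairs: List[Pair], chi: Callable[[int, State], bool], universe: List[int]) -> str:
    """Label an abstract edge from the concrete steps (s, s') it represents."""
    sets = [(rank_set(chi, s, universe), rank_set(chi, s2, universe)) for s, s2 in pairs]
    if sets and all(after < before for before, after in sets):
        return "-"   # every concrete step strictly shrinks the set
    if any(after > before for before, after in sets):
        return "+"   # some concrete step strictly enlarges the set
    return "="


def fairness_conditions(labels: Dict[str, str]) -> List[Tuple[str, FrozenSet[str]]]:
    """One strong fairness pair (edge, T_plus) per edge labeled '-'."""
    plus = frozenset(e for e, lab in labels.items() if lab == "+")
    return [(e, plus) for e, lab in sorted(labels.items()) if lab == "-"]


def st(*waiting: int) -> State:
    return {"W": frozenset(waiting)}


chi = lambda i, s: i in s["W"]           # ranking predicate: process i is waiting
edges = {
    "t_dec": [(st(0, 1), st(0))],
    "t_inc": [(st(0), st(0, 1))],
    "t_same": [(st(0), st(0))],
}
labels = {name: label_edge(pairs, chi, [0, 1, 2]) for name, pairs in edges.items()}
print(labels, fairness_conditions(labels))
```

The resulting fairness pair says that the decreasing edge can only repeat forever if an increasing edge does too, which is exactly what a well-founded ranking value enforces.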

Lifting fairness. Recall that by definition of $\alpha$ (see Section 4),
we introduce the abstract variables
$e_\tau \equiv \exists_P i : i \in E_\tau$ and
$t_\tau \equiv \exists_P i : i \in T_\tau$. We now argue that it is safe
to augment $S_A$ with the strong fairness
$\mathcal{C}_A = \{ (e_\tau, t_\tau) \mid (E_\tau, T_\tau) \in \mathcal{C} \}$,
i.e., if $e_\tau$ is infinitely often true, then also $t_\tau$ is
infinitely often true. Consider a computation where $e_\tau$ is
infinitely often true, that is, $\exists_P i : i \in E_\tau$ is
infinitely often true. Now, each instance of the parameterized system
only contains a bounded number of processes; hence, by König's lemma,
there must exist some $i$ such that $i \in E_\tau$ infinitely often in
this computation. Therefore, by the strong fairness condition of the
concrete system, we must have $i \in T_\tau$ infinitely often, and hence
the computation satisfies $\Box\Diamond(\exists_P i : i \in T_\tau)$.
Consequently:







Lemma 1. Under the assumptions above we have
$S \sqsubseteq^{\mathcal{F}}_\alpha S_A$.

The reasoning above does not hold for weak fairness. Indeed,
$\Diamond\Box e_\tau$ may hold for a computation without the existence of
an $i$ with $\Diamond\Box(i \in E_\tau)$.

Recall also that, as explained in Section 4, we introduce for each
transition $\tau$ of the distinguished process $p$ abstract variables
$b_{E_\tau}$ and $b_{T_\tau}$ expressing whether the transition is
enabled, respectively taken. We can show that it is safe to augment the
abstract system with strong and weak fairness conditions on the
transitions of $p$.



Lemma 2. For the concrete WS1S system $S$ and the abstract system
$S^{\mathcal{F}}_A$ we have:

\[
S \sqsubseteq^{\mathcal{F}}_\alpha S^{\mathcal{F}}_A
\]

where $\alpha \equiv \exists i \in \omega . \widehat{\alpha}_i[b_X / b^i_X]$
for a generic abstraction function $\alpha_i$ and $S^{\mathcal{F}}_A$
with strong fairness requirements
$\mathcal{C}_A = \{ (b_{E_\tau}, b_{T_\tau}) \mid (E_\tau, T_\tau) \in \mathcal{C} \}$
and weak fairness requirements
$\mathcal{J}_A = \{ (b_{E_\tau}, b_{T_\tau}) \mid (E_\tau, T_\tau) \in \mathcal{J} \}$. □



Example 2. Recall that we want to verify that our algorithm satisfies
the mutual exclusion property as well as the universal property that each
process $p$ reaches its critical section infinitely often, i.e.,
$\forall_n p : \Box\Diamond\, at\_\ell_2[p]$.

According to the method presented in Section 4, we construct the
abstract system $S_A$ from the WS1S translation. For the mutual exclusion
property we take as abstract variable
$inv \equiv At\_\ell_2 \subseteq Turn \land \forall_P i, j : (i \in Turn \land j \in Turn) \Rightarrow i = j$.
Our tool PAX constructs the abstract system and provides translations to
several input languages for model-checkers, e.g., Spin and SMV. Also, the
abstract state space can be explored to prove that $inv$ is indeed an
invariant of the abstract system and, hence, mutual exclusion holds for
the original system.

Next, using the marking algorithm, we augment $S_A$ with the strong
fairness requirements
$\{ (\tau_{01}, \{\tau_{20}\}), (\tau_{12}, \{\tau_{01}\}), (\tau_{20}, \{\tau_{12}\}) \}$
to obtain a fair abstract system $S^{\mathcal{F}}_A$. Moreover, with
Lemma 1 we can lift the strong fairness $(e_{11}, t_{11})$. Lemma 2
allows us to augment $S^{\mathcal{F}}_A$ with another strong fairness
condition $(b_{E_{11}}, b_{T_{11}})$ for the distinguished process as well
as with the weak fairness $(b_{E_{01}}, b_{T_{01}})$.

All the fairness conditions can be expressed as LTL formulae. We
used Spin to prove that $\Box\Diamond\, b_{At\_\ell_2}$ holds in
$S^{\mathcal{F}}_A$, which means that, in the original system, each
process reaches its critical section infinitely often.
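To make the remark about expressing fairness in LTL concrete, the following Python sketch assembles formulas in Spin's textual LTL syntax (`[]` for always, `<>` for eventually, `->` and `&&` for the connectives). It is a hedged illustration: the proposition names `e11`, `t11`, `be01`, `bt01` and `b_at_l2` are placeholders for the abstract variables of the example, and the text does not state how PAX or the authors actually encoded the formula.

```python
from typing import Iterable


def strong(enabled: str, taken: str) -> str:
    """Strong fairness: infinitely often enabled implies infinitely often taken."""
    return f"(([]<> {enabled}) -> ([]<> {taken}))"


def weak(enabled: str, taken: str) -> str:
    """Weak fairness: continuously enabled implies infinitely often taken."""
    return f"((<>[] {enabled}) -> ([]<> {taken}))"


def under_fairness(conditions: Iterable[str], goal: str) -> str:
    """Property to check: the goal holds on every run satisfying all conditions."""
    return "(" + " && ".join(conditions) + ") -> " + goal


formula = under_fairness(
    [strong("e11", "t11"), weak("be01", "bt01")],
    "([]<> b_at_l2)",
)
print(formula)
```

Encoding fairness as the premise of an implication keeps the abstract model itself unchanged, at the price of a larger formula for the model checker.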







6 Conclusion

We presented a method for the verification of universal properties of
parameterized networks. Our method is based on the transformation of an
infinite family of systems into a single WS1S transition system and on
applying abstraction techniques to this system. To be able to prove
liveness properties, we presented a method to add fairness requirements
to the abstract system. We have successfully applied this method, which
has been implemented in our tool PAX, to a number of parameterized
protocols, including Dijkstra's and Szymanski's mutual exclusion
protocols.



References 



[ABJN99] P.A. Abdulla, A. Bouajjani, B. Jonsson, and M. Nilsson. Handling
Global Conditions in Parameterized System Verification. In N. Halbwachs
and D. Peled, editors, CAV '99, volume 1633 of LNCS, pages 134-145.
Springer, 1999.

[BBLS00] K. Baukus, S. Bensalem, Y. Lakhnech, and K. Stahl. Abstracting
WS1S Systems to Verify Parameterized Networks. In S. Graf and
M. Schwartzbach, editors, TACAS'00, volume 1785. Springer, 2000.

[BCG89] M.C. Browne, E.M. Clarke, and O. Grumberg. Reasoning about
networks with many identical finite state processes. Information and
Computation, 1989.

[BLS00] K. Baukus, Y. Lakhnech, and K. Stahl. Verifying Universal
Properties of Parameterized Networks. Technical Report TR-ST-00-4, CAU
Kiel, 2000.

[Büc60] J.R. Büchi. Weak Second-Order Arithmetic and Finite Automata. Z.
Math. Logik Grundl. Math., 6:66-92, 1960.

[CC77] P. Cousot and R. Cousot. Abstract interpretation: A unified
lattice model for static analysis of programs by construction or
approximation of fixpoints. In 4th ACM symp. of Prog. Lang., pages
238-252. ACM Press, 1977.

[CGJ95] E. Clarke, O. Grumberg, and S. Jha. Verifying Parameterized
Networks using Abstraction and Regular Languages. In I. Lee and
S. Smolka, editors, CONCUR '95: Concurrency Theory, LNCS. Springer, 1995.

[CGL94] E. M. Clarke, O. Grumberg, and D. E. Long. Model checking and
abstraction. ACM Transactions on Programming Languages and Systems,
16(5), 1994.

[DGG94] D. Dams, R. Gerth, and O. Grumberg. Abstract interpretation of
reactive systems: Abstractions preserving ∀CTL*, ∃CTL* and CTL*. In
E.-R. Olderog, editor, Proceedings of PROCOMET '94. North-Holland, 1994.

[Elg61] C.C. Elgot. Decision problems of finite automata design and
related arithmetics. Trans. Amer. Math. Soc., 98:21-52, 1961.

[EN96] E. A. Emerson and K. S. Namjoshi. Automatic verification of
parameterized synchronous systems. In 8th Conference on Computer Aided
Verification, LNCS 1102, pages 87-98, 1996.

[GS92] S. M. German and A.P. Sistla. Reasoning about systems with many
processes. Journal of the ACM, 39(3):675-735, 1992.

[HJJ+96] J.G. Henriksen, J. Jensen, M. Jørgensen, N. Klarlund, B. Paige,
T. Rauhe, and A. Sandholm. Mona: Monadic Second-Order Logic in Practice.
In TACAS '95, volume 1019 of LNCS. Springer, 1996.

[JN00] B. Jonsson and M. Nilsson. Transitive closures of regular
relations for verifying infinite-state systems. In S. Graf and
M. Schwartzbach, editors, TACAS'00, volume 1785. Lecture Notes in
Computer Science, 2000.

[KM89] R.P. Kurshan and K. McMillan. A structural induction theorem for
processes. In ACM Symp. on Principles of Distributed Computing, Canada,
pages 239-247, Edmonton, Alberta, 1989.

[KMM+97] Y. Kesten, O. Maler, M. Marcus, A. Pnueli, and E. Shahar.
Symbolic Model Checking with Rich Assertional Languages. In O. Grumberg,
editor, Proceedings of CAV '97, volume 1256 of LNCS, pages 424-435.
Springer, 1997.

[Tho90] W. Thomas. Automata on infinite objects. In Handbook of
Theoretical Computer Science, Volume B: Formal Methods and Semantics,
pages 134-191. Elsevier Science Publishers B. V., 1990.

[WL89] P. Wolper and V. Lovinfosse. Verifying properties of large sets of
processes with network invariants (extended abstract). In Sifakis,
editor, Workshop on Computer Aided Verification, LNCS 407, pages 68-80,
1989.




Author Index 



Adélaïde, M., 252
Agrawala, A., 121 
Altisen, K., 106 
Androutsopoulos, K., 46 
Arora, A., 82 

Back, R.J., 202 
Baukus, K., 291 
Benveniste, A., 134 
Bhattacharjee, A.K., 152 
Breitling, M., 58 

Caspi, P., 70
Clark, D., 46 

D’Souza, D., 240 
Damm, W., 18 
Dams, D., 276 
Dhodapkar, S.D., 152 

Geilen, M., 276 
Gößler, G., 106
Guelev, D. P., 264 

Halbwachs, N., 1 
Hansson, H., 94 
Hery, J.-F., 1 
Hayes, I., 170 
Holenderski, L., 214 

Jensen, H. E., 19 

Kan, P., 46 
Karunakar, K., 152 
Kulkarni, S., 82 



Lakhnech, Y., 291 
Laleuf, J.-C., 1 
Lano, K., 46 
Larsen, K. G., 19 

van der Meyden, R., 185 
Moses, Y., 12, 185 

Nadjm-Tehrani, S., 134
Nicollin, X., 1 
Nissanke, N., 228 
Norstrom, C., 94 

Petre, L., 202 
Porres, I., 202
Punnekkat, S., 94 

Rajan, B., 152 
Roux, O., 252 

Salem, R., 70
Sen, G., 152 

Shyamasundar, R.K., 152 
Sifakis, J., 106 
Skou, A., 19 
Sproston, J., 31 
Stahl, K., 291 
Stromberg, J.-E., 134 
Subramani, K., 121 

Tudoret, S., 134 

Veloudis, S., 228 




